Abstract. Given an imprecise probabilistic model over a continuous space,
computing lower/upper expectations is often computationally hard,
even in simple cases. Because expectations are essential in decision making
and risk analysis, tractable methods to compute them are crucial in many
applications involving imprecise probabilistic models. We concentrate on p-boxes
(a simple and popular model), and on the computation of lower expectations
of non-monotone functions. This paper is devoted to the univariate case, that
is, the case where only one variable is uncertain. We propose and compare two
approaches: the first using general linear programming, and the second
using the fact that p-boxes are special cases of random sets. We underline the
complementarity of both approaches, as well as their differences.



1. Introduction 

There are many situations where a unique probability distribution cannot be
identified to describe our uncertainty about the value assumed by a variable on a
state space. This can happen, for example, when data or expert judgments are not
sufficient and/or are conflicting. In such cases, a solution is to model information by
the means of imprecise probabilities, that is, by considering either sets of probability
distributions [17], [14] or bounds on expectations [18]. Note that, from a purely
mathematical point of view, such representations encompass many other frameworks
dealing with the representation of incomplete and conflicting information,
such as random sets [7] and possibility theory [12].

When considering such models, the expectation of a real-valued bounded function
over the state space is no longer precise and is lower- and upper-bounded by
some values. In applications involving risk analysis or decision making, the decision
process will be based on the values of these lower and upper expectations, using
extensions of the classical expected utility criterion [25]. When the state space on
which the variable assumes its value is finite, lower and upper expectations can be
numerically computed by using, for instance, linear programming techniques [26].
The problem becomes considerably more complicated when uncertainty models are defined
over infinite state spaces (e.g., the real line, product spaces, ...).

In this latter case, computing exactly and analytically the lower and upper
expectations of a given function is impossible most of the time, and there are
very few methods and algorithms around to compute approximations of these
bounds [H [21] [24]. In this paper, we study such analytical solutions for a specific
case, namely the one where the uncertainty over a variable is described by a pair of
upper and lower cumulative distributions (a so-called p-box [13]). In essence, such
a study comes down to searching for the extremal points of the p-box for which the
expectation bounds are reached. The features of these solutions also allow us to suggest
some ways to build more efficient numerical methods and algorithms, useful when
analytical solutions cannot be computed. We also assume that the function over
which lower and upper expectations have to be computed can be non-monotone but
has a (partially) known behaviour. In this paper, we concentrate on the univariate
case, i.e., where the value assumed by only one variable is tainted with uncertainty.
The multivariate case as well as the case of mixed strategies (expectation bounds
computed over mixtures of functions) are left for forthcoming papers.

P-boxes are one of the simplest and most popular models of sets of probability
distributions, directly extending cumulative distributions used in the precise case.
P-boxes are often used in applications [16], as they can be easily derived from small
samples or from expert opinions expressed in terms of imprecise percentiles;
consequently, our study is likely to be useful in many practical situations. P-box
models can also be found in robust Bayesian analysis, where they are known as
distribution band classes [2]. In other cases, the poor expressiveness of p-boxes
compared to more general sets of probabilities is clearly a limitation 0. However,
as we shall see, their simplicity allows for more efficient computations, and they
can provide quick first approximations. If these first approximations
already allow a decision to be taken, there is no need to consider more complex (and
computationally demanding) models.

Methods developed in the paper are based on two different approaches, and
we found it interesting to emphasize similarities and differences between these
approaches, as well as how one approach can help the other: the first is based on
the fact that the computation of bounding expectations can be viewed as a linear
programming problem, while the second uses the fact that a p-box is a particular
case of a random set [16] IB]. Approximating lower and upper expectations with
these approaches mainly consists in discretizing the uncertainty models. In this
sense, they are different from other approaches discretizing the state space [21] [24].

We first state the general problem in Section 2, explain how to solve it by using linear
programming and random sets, and introduce the problem of conditioning by an
observed event. We then study the computation of lower/upper expectations of
a function over the p-box for different behaviours. Going from the simplest case
to the most general one, we start with monotone functions in Section 3, pursue
with functions having one extremum in Section 4, and finish with general (bounded)
continuous functions in Section 5.



2. General problem statement 

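We assume that the information about a (real-valued) random variable $X$ is (or
can be) represented by lower $\underline{F}$ and upper $\overline{F}$ cumulative probability distributions
defining the p-box $[\underline{F}, \overline{F}]$ [13]. Lower $\underline{F}$ and upper $\overline{F}$ distributions thus define a set
$\Phi(\underline{F}, \overline{F})$ of precise distributions such that

(1)  $$\Phi(\underline{F}, \overline{F}) = \{F \mid \forall x \in \mathbb{R},\ \underline{F}(x) \le F(x) \le \overline{F}(x)\}.$$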

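Given a function $h(X)$, lower ($\underline{\mathbb{E}}$) and upper ($\overline{\mathbb{E}}$) expectations over $[\underline{F}, \overline{F}]$ of $h(X)$
can be computed by means of a procedure sometimes called natural extension [30], [31],
which corresponds to the following equations:

(2)  $$\underline{\mathbb{E}}(h) = \inf_{F \in \Phi(\underline{F},\overline{F})} \int_{\mathbb{R}} h(x)\,dF, \qquad \overline{\mathbb{E}}(h) = \sup_{F \in \Phi(\underline{F},\overline{F})} \int_{\mathbb{R}} h(x)\,dF.$$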

Computing the lower (resp. upper) expectation can be seen as finding the
extremizing distribution $F$ inside $\Phi(\underline{F}, \overline{F})$ reaching the infimum (resp. supremum) in
Equations (2). If we consider the convex set of probabilities induced by $\Phi(\underline{F}, \overline{F})$,
this is equivalent to finding the extreme point (i.e., vertex) of this convex set where
the bounds are reached, among all vertices (here infinitely many). Solving
Equations (2) exactly is usually very difficult, although sometimes possible, even when
analytical expressions of $h, \overline{F}, \underline{F}$ are known. In practice, numerical methods must
often be used to solve the problem and estimate both the upper and lower
expectations. Upper and lower expectations are dual [31, Ch. 2], in the sense that
$\overline{\mathbb{E}}(h) = -\underline{\mathbb{E}}(-h)$. This will allow us to concentrate only on the lower expectations
for some cases studied in the sequel. We now detail the two generic approaches
used throughout the paper to solve the above problem. Note that, throughout the
paper, we restrict ourselves either to $\sigma$-additive probabilities or to
continuous functions $h$, as such assumptions are not, from a practical standpoint,
very limiting.

We will denote by $I_A$ the indicator function of the set $A$, that is, the function such
that $I_A(x) = 1$ if $x \in A$, zero otherwise. The lower (resp. upper) expectation of
this function, $\underline{\mathbb{E}}(I_A)$ (resp. $\overline{\mathbb{E}}(I_A)$), has the same value as the lower (resp. upper)
probability $\underline{P}(A)$ (resp. $\overline{P}(A)$) of the event $A$ induced by the set $\Phi(\underline{F}, \overline{F})$.

2.1. Linear programming view. Although we assume that readers have basic
knowledge of linear programming (for an introduction to the topic, see for
example Vanderbei [29]), we will recall basic results coming from this theory when
they are used in the paper.

As sets of probabilities can be expressed through linear constraints over
expectations, and as expectation is a linear functional, it is quite natural to translate
Equations (2) into linear programs. The linear programs corresponding to the lower
expectation are summarized below.



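Primal problem:

$$\text{Min. } v = \int_{-\infty}^{\infty} h(x)\,p(x)\,dx$$

subject to

$$p(x) \ge 0, \qquad \int_{-\infty}^{\infty} p(x)\,dx = 1,$$
$$-\int_{-\infty}^{x} p(t)\,dt \ge -\overline{F}(x), \qquad \int_{-\infty}^{x} p(t)\,dt \ge \underline{F}(x).$$

Dual problem:

$$\text{Max. } w = c_0 + \int_{-\infty}^{\infty} \left(-c(t)\,\overline{F}(t) + d(t)\,\underline{F}(t)\right) dt$$

subject to

$$c_0 + \int_{x}^{\infty} \left(-c(t) + d(t)\right) dt \le h(x),$$
$$c_0 \in \mathbb{R}, \qquad c(x) \ge 0, \qquad d(x) \ge 0.$$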



Here $v$ and $w$ are the objective functions to respectively minimize and maximize
for the primal and dual problems, and $p(x)$ is a probability density function having
a cumulative distribution inside $\Phi(\underline{F}, \overline{F})$. Since both the primal and dual problems
are feasible (i.e., have solutions satisfying their constraints), their optimal
values coincide (due to strong duality [29, Ch. 5]) and are equal to $\underline{\mathbb{E}}(h)$.

Numerically solving the above problem can be done by approximating the
probability distribution function $F$ by a set of $N$ points $F(x_i)$, $i = 1, \ldots, N$, and by
translating Equations (2) into the corresponding linear programming problem with
$N$ optimization variables and where constraints correspond to Equation (1). Those
linear programming problems are of the form

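(3)  $$\underline{\mathbb{E}}^*(h) = \inf \sum_{k=1}^{N} h(x_k)\,z_k \qquad \text{or} \qquad \overline{\mathbb{E}}^*(h) = \sup \sum_{k=1}^{N} h(x_k)\,z_k$$

subject to

$$z_i \ge 0,\ i = 1, \ldots, N, \qquad \sum_{k=1}^{N} z_k = 1,$$
$$\sum_{k=1}^{i} z_k \le \overline{F}(x_i), \qquad \sum_{k=1}^{i} z_k \ge \underline{F}(x_i), \qquad i = 1, \ldots, N,$$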

where the $z_k$ are the optimization variables, and the objective function $\underline{\mathbb{E}}^*(h)$ (resp.
$\overline{\mathbb{E}}^*(h)$) is an approximation of the lower (resp. upper) expectation. Note that the
primal problem may not always be feasible (e.g., consider $N = 1$ and $\overline{F}(x_1) =
\underline{F}(x_1) < 1$) if $N$ is too small or the values $x_i$ are badly chosen. Also, the inequality
$\underline{\mathbb{E}}(h) \le \underline{\mathbb{E}}^*(h)$ (or its converse) does not always hold when solving the above
discretized problem. The approximated solution $\underline{\mathbb{E}}^*$ is thus not a guaranteed inner or
outer approximation. A solution to obtain a guaranteed inner approximation is to
replace, for $i = 1, \ldots, N$, $\underline{F}(x_i)$ by $\underline{F}(x_{i+1})$ in the constraints $\sum_{k=1}^{i} z_k \ge \underline{F}(x_i)$, with
$\underline{F}(x_{N+1}) = 1$, since in this case, any solution to the linear program would be such
that, for any $x \in [x_i, x_{i+1}]$,

$$\underline{F}(x) \le \underline{F}(x_{i+1}) \le \sum_{k=1}^{i} z_k \le \overline{F}(x_i) \le \overline{F}(x);$$

consequently, the (discrete) cumulative distribution formed by the values $z_k$, $k =
1, \ldots, N$, is in $\Phi(\underline{F}, \overline{F})$. However, for this linear program to have a solution, we
must be able to choose the $x_i$, $i = 1, \ldots, N$, on $\mathbb{R}$ such that $\overline{F}(x_i) \ge \underline{F}(x_{i+1})$.
Besides not always being possible, this puts necessary constraints on the chosen
discretization of $\mathbb{R}$.

Let us now write the dual linear programming problem for computing $\underline{\mathbb{E}}^{**}(h)$,
taking points $y_i$ different from $x_i$,

(4)  $$\underline{\mathbb{E}}^{**}(h) = \max \left( c_0 + \sum_{i=1}^{N} \left( d_i\,\underline{F}(y_i) - c_i\,\overline{F}(y_i) \right) \right)$$

subject to $c_0 \in \mathbb{R}$, $c_i \ge 0$, $d_i \ge 0$, and

$$c_0 + \sum_{k=i}^{N} (d_k - c_k) \le h(y_i), \qquad i = 1, \ldots, N,$$

where $c_0, c_i, d_i$ are the optimization variables, and $y_i = (x_{i-1} + x_i)/2$.
When both problems are discretized, equality between their optimal values
no longer holds, but they converge towards the same value as $N$ grows. To approximate
the solution, one can let $N$ grow iteratively until the difference $|\underline{\mathbb{E}}^*(h) - \underline{\mathbb{E}}^{**}(h)|$
is smaller than a given value $\varepsilon > 0$ characterizing the accuracy of the solutions.
However, this way of determining the lower and upper expectations runs into
computational difficulties if many iterations are needed and if the value of $N$ is
rather large. Indeed, the primal optimization problem has $N$ variables and $3N + 1$
constraints. On the other hand, solving the primal and dual approximated problems
only once with a small value of $N$ can lead to bad approximations of the exact value.
Also important is the question of how to choose or sample the values $x_i$ to improve
numerical convergence. In other words, are there regions that should be sampled
more densely than others? A generic algorithm (for $\underline{\mathbb{E}}$) would look as follows:

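(1) Fix a precision threshold $\varepsilon$ and an initial value of $N$

(2) Sample $N$ values $x_i$ s.t. $\overline{F}(x_i) > 0$ and $\underline{F}(x_i) < 1$

(3) Compute $\underline{\mathbb{E}}^*(h)$ and $\underline{\mathbb{E}}^{**}(h)$

(4) If $|\underline{\mathbb{E}}^*(h) - \underline{\mathbb{E}}^{**}(h)| < \varepsilon$, stop, else increase $N$ and return to step 2.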

In the sequel, we will see that knowing $h$ and its behaviour can significantly
improve both the accuracy and efficiency of expectation bound computations. It also
provides some insight as to how the values $x_i$ could be sampled.

2.2. Random set view. Now that we have given a global sketch of the linear
programming approach, we can detail the one using random sets. Formally, a
random set is a mapping $\Gamma$ from a probability space to the power set $\wp(X)$ of
another space $X$, also called a multi-valued mapping. This mapping induces lower
and upper probabilities on $X$ [7]. Here, we consider the unit interval $[0, 1]$ equipped
with Lebesgue measure as the probability space, and $\wp(X)$ are the measurable
subsets of the real line $\mathbb{R}$.

Given the p-box $[\underline{F}, \overline{F}]$, we will denote by $A_\gamma = [a_{*\gamma}, a^*_\gamma]$ the set such that

$$a_{*\gamma} := \sup\{x \in \mathbb{R} : \overline{F}(x) < \gamma\} = \overline{F}^{-1}(\gamma),$$
$$a^*_\gamma := \inf\{x \in \mathbb{R} : \underline{F}(x) > \gamma\} = \underline{F}^{-1}(\gamma).$$

Figure 1. P-box as random set, illustration

By extending existing results [HI [13] to the continuous real line [Qj Jj, we can
conclude that the p-box is equivalent to the continuous random set with a
uniform mass density on $[0, 1]$ and a mapping (see Figure 1) such that

$$\Gamma(\gamma) = A_\gamma = [a_{*\gamma}, a^*_\gamma], \qquad \gamma \in [0, 1].$$

Note that both $\overline{F}^{-1}(\gamma)$ and $\underline{F}^{-1}(\gamma)$ are non-decreasing functions of $\gamma$. The interest of
this mapping $\Gamma$ is that it allows us to rewrite Equations (2) in the following form:

(5)  $$\underline{\mathbb{E}}(h) = \int_0^1 \inf_{x \in A_\gamma} h(x)\,d\gamma,$$

(6)  $$\overline{\mathbb{E}}(h) = \int_0^1 \sup_{x \in A_\gamma} h(x)\,d\gamma.$$

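Again, finding analytical solutions of such integrals is not easy in the general
case, but numerical approximations can be computed (with more or less difficulty)
by discretizing the p-box on a finite number of levels $\gamma_i$, the main difficulty in the
general case being to find the infimum or supremum of $h$ for each discretized
level. Note that, in the finite case, a random set can be represented by non-null
weights, here denoted $m$, given to subsets of the space $X$ and summing up to one (i.e.,
$\sum_{E \subseteq X} m(E) = 1$). Let $\gamma_0 = 0 < \gamma_1 < \ldots < \gamma_M = 1$ and define the discrete
random set $\overline{\Gamma}$ such that for $i = 1, \ldots, M$

$$\overline{A}_{\gamma_i} := [a_{*\gamma_{i-1}}, a^*_{\gamma_i}], \qquad m(\overline{A}_{\gamma_i}) = \gamma_i - \gamma_{i-1}.$$

We denote by $\Phi(\underline{F}, \overline{F})_{\overline{\Gamma}}$ the set of precise distributions induced by $\overline{\Gamma}$. This
discretization, which is an outer approximation of the p-box $[\underline{F}, \overline{F}]$ (i.e., $\Phi(\underline{F}, \overline{F}) \subseteq
\Phi(\underline{F}, \overline{F})_{\overline{\Gamma}}$), is sometimes referred to as the ODM (Outer Discretization Method) and
has been studied by other authors [23]. Working with $\overline{\Gamma}$, Equations (5), (6) can be
rewritten as

$$\underline{\mathbb{E}}_{\overline{\Gamma}}(h) = \sum_{i=1}^{M} m(\overline{A}_{\gamma_i}) \inf_{x \in \overline{A}_{\gamma_i}} h(x), \qquad \overline{\mathbb{E}}_{\overline{\Gamma}}(h) = \sum_{i=1}^{M} m(\overline{A}_{\gamma_i}) \sup_{x \in \overline{A}_{\gamma_i}} h(x).$$

Let us now define another discrete random set $\underline{\Gamma}$ such that for $i = 1, \ldots, M$

$$\underline{A}_{\gamma_i} := \begin{cases} [a_{*\gamma_i}, a^*_{\gamma_{i-1}}] & \text{if } a_{*\gamma_i} \le a^*_{\gamma_{i-1}}, \\ \emptyset & \text{otherwise,} \end{cases} \qquad m(\underline{A}_{\gamma_i}) = \gamma_i - \gamma_{i-1}.$$

We denote by $\Phi(\underline{F}, \overline{F})_{\underline{\Gamma}}$ the set of precise distributions induced by $\underline{\Gamma}$. $\underline{\Gamma}$ is an inner
approximation of the p-box (i.e., $\Phi(\underline{F}, \overline{F})_{\underline{\Gamma}} \subseteq \Phi(\underline{F}, \overline{F})$), and Equations (5), (6) can
again be rewritten

$$\underline{\mathbb{E}}_{\underline{\Gamma}}(h) = \sum_{i=1}^{M} m(\underline{A}_{\gamma_i}) \inf_{x \in \underline{A}_{\gamma_i}} h(x), \qquad \overline{\mathbb{E}}_{\underline{\Gamma}}(h) = \sum_{i=1}^{M} m(\underline{A}_{\gamma_i}) \sup_{x \in \underline{A}_{\gamma_i}} h(x).$$

Note that when there is an index $i$ for which $\underline{A}_{\gamma_i} = \emptyset$, $\underline{\Gamma}$ no longer describes a
non-empty set of probabilities, and we will call such a random set inconsistent.
This case can be compared to the case when the linear program giving a guaranteed
inner approximation has no feasible solution.

We have that $\underline{\mathbb{E}}_{\overline{\Gamma}}(h) \le \underline{\mathbb{E}}(h) \le \underline{\mathbb{E}}_{\underline{\Gamma}}(h)$ (due to the inclusions $\Phi(\underline{F}, \overline{F})_{\underline{\Gamma}} \subseteq \Phi(\underline{F}, \overline{F}) \subseteq
\Phi(\underline{F}, \overline{F})_{\overline{\Gamma}}$). Thus, to approximate the solution we can again let $M$ grow until
$|\underline{\mathbb{E}}_{\underline{\Gamma}}(h) - \underline{\mathbb{E}}_{\overline{\Gamma}}(h)|$ is smaller than a given accuracy $\varepsilon > 0$. As in the case of linear
programming, choosing too few levels $\gamma_i$ or using poor heuristics to find the
infimum/supremum over sets can lead to bad approximations, and if those
infima/suprema are hard to find, computational difficulties can arise. A generic
algorithm (for $\underline{\mathbb{E}}$) using random sets would be as follows: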

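(1) Fix a precision threshold $\varepsilon$ and an initial value of $M$

(2) Sample $M$ values $\gamma_i$

(3) Compute $\underline{\mathbb{E}}_{\overline{\Gamma}}(h)$ and $\underline{\mathbb{E}}_{\underline{\Gamma}}(h)$

(4) If $|\underline{\mathbb{E}}_{\underline{\Gamma}}(h) - \underline{\mathbb{E}}_{\overline{\Gamma}}(h)| < \varepsilon$, stop, else increase $M$ and return to step 2.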
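The four steps above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function names (`grid_inf`, `lower_expectation_bounds`, `refine`), the equally spaced levels, the doubling of $M$, the grid search used for the infimum, and the bounded p-box given by the quantile functions `a_low` and `a_up` are all hypothetical choices made for the illustration. The outer focal sets are taken as $[a_{*\gamma_{i-1}}, a^*_{\gamma_i}]$ and the inner ones as $[a_{*\gamma_i}, a^*_{\gamma_{i-1}}]$.

```python
def grid_inf(h, lo, hi, n=201):
    """Heuristic infimum of h on [lo, hi]: minimum over an n-point grid.

    The grid includes both endpoints, so the result is exact for monotone h
    and only approximate otherwise.
    """
    if lo > hi:
        raise ValueError("empty focal set: the inner random set is inconsistent")
    step = (hi - lo) / (n - 1)
    return min(h(lo + k * step) for k in range(n))


def lower_expectation_bounds(h, a_low, a_up, M):
    """Return (outer, inner) approximations of the lower expectation of h.

    a_low(g) and a_up(g) are the endpoints of A_gamma = [a_low(g), a_up(g)].
    Levels are equally spaced, gamma_i = i / M (an assumption of this sketch).
    """
    outer = inner = 0.0
    for i in range(1, M + 1):
        g_prev, g = (i - 1) / M, i / M
        mass = g - g_prev
        # Outer focal set: contains every A_gamma with gamma in (g_prev, g].
        outer += mass * grid_inf(h, a_low(g_prev), a_up(g))
        # Inner focal set: contained in every A_gamma with gamma in (g_prev, g].
        inner += mass * grid_inf(h, a_low(g), a_up(g_prev))
    return outer, inner


def refine(h, a_low, a_up, eps, M=2):
    """Double M until the inner and outer approximations differ by less than eps."""
    while True:
        outer, inner = lower_expectation_bounds(h, a_low, a_up, M)
        if inner - outer < eps:
            return outer, inner, M
        M *= 2


# Hypothetical bounded p-box: upper CDF uniform on [0, 4], lower CDF uniform on [2, 10].
outer, inner, M = refine(lambda x: 20.0 - x,
                         lambda g: 4.0 * g,
                         lambda g: 2.0 + 8.0 * g,
                         eps=0.1)
```

For this decreasing $h$, the exact lower expectation is $\int_0^1 (20 - (2 + 8\gamma))\,d\gamma = 14$, and the two approximations bracket it. A bounded p-box is used on purpose: with unbounded support, the last outer focal set is unbounded and the infimum of an unbounded $h$ over it is not finite.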

Note that the distance between two consecutive levels $\gamma_i, \gamma_{i+1}$ does not have to be
constant. If $\underline{\Gamma}$ is inconsistent, an alternative is to use one of the two random sets $\Gamma_1, \Gamma_2$
such that for $i = 1, \ldots, M$
^ '~ \ m{A^. ^) = 7^ - 7^-1, ^ '~ \ m{A^^^^) = -fi - 7^-1. 
The corresponding approximations read, for $j = 1, 2$,




Compared to $\underline{\Gamma}$, $\Gamma_1, \Gamma_2$ have the advantage of always being consistent, but the obtained
approximations can either outer- or inner-approximate the exact values, even if they
converge towards them as $M$ increases.



2.3. Conditional lower/upper expectations. Another quite common problem
when dealing with imprecise probabilities is the procedure of conditioning and the
computation of associated lower/upper conditional expectations. Suppose that we
observe an event $B = [b_0, b_1]$. Then the lower and upper conditional expectations,
given the p-box $[\underline{F}, \overline{F}]$ and under condition of $B$, can be determined as follows:

$$\underline{\mathbb{E}}(h|B) = \inf_{\underline{F} \le F \le \overline{F}} \frac{\int_{\mathbb{R}} h(x)\,I_B(x)\,dF}{\int_{\mathbb{R}} I_B(x)\,dF},$$

$$\overline{\mathbb{E}}(h|B) = \sup_{\underline{F} \le F \le \overline{F}} \frac{\int_{\mathbb{R}} h(x)\,I_B(x)\,dF}{\int_{\mathbb{R}} I_B(x)\,dF}.$$

The above formulas are equivalent to applying Bayes formula to every probability 
measure inside ^{F_,F), and then retrieving the optimal bounds. Other general- 
isations of Bayes formula to imprecise probabilistic framework exist [HI [31], but 
we will restrict ourselves to the above solution, as it is by far the most used within 
frameworks using lower/upper expectation bounds. Also, we assume that $B$ is
large enough (or the two distributions $[\underline{F}, \overline{F}]$ close enough) so that $\underline{F}(b_1) > \overline{F}(b_0)$.
This is equivalent to requiring $\underline{P}(B) > 0$, thus avoiding conditioning on an event of
probability 0. Indeed, there are still some discussions about what should be done
in presence of such events (see Miranda [18] for an introductory discussion and 
Cozman [5] for possible numerical solutions). 

Similarly to unconditional expectations, the above problems can numerically be
solved by approximating the probability distribution function $F$ by a set of $N$
points $F(x_i)$, $i = 1, \ldots, N$, and by writing linear-fractional optimization problems¹
and then the associated linear programming problems. Problems mentioned for the
unconditional case can again occur. The next proposition indicates that previous
results can be used to provide a more attractive formulation of $\underline{\mathbb{E}}(h|B)$, $\overline{\mathbb{E}}(h|B)$.

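Proposition 1. Given a p-box $[\underline{F}, \overline{F}]$, a function $h(x)$ and an event $B$, the upper
and lower conditional expectations of $h(X)$ on $[\underline{F}, \overline{F}]$ after observing the event $B$
can be written

(7)  $$\overline{\mathbb{E}}(h|B) = \sup_{\substack{\underline{F}(b_0) \le \alpha \le \overline{F}(b_0) \\ \underline{F}(b_1) \le \beta \le \overline{F}(b_1)}} \frac{1}{\beta - \alpha}\,\Psi(\alpha, \beta),$$

(8)  $$\underline{\mathbb{E}}(h|B) = \inf_{\substack{\underline{F}(b_0) \le \alpha \le \overline{F}(b_0) \\ \underline{F}(b_1) \le \beta \le \overline{F}(b_1)}} \frac{1}{\beta - \alpha}\,\Phi(\alpha, \beta),$$

with

$$\Psi(\alpha, \beta) = \int_\alpha^\beta \sup_{x \in A_\gamma \cap B} h(x)\,d\gamma, \qquad \Phi(\alpha, \beta) = \int_\alpha^\beta \inf_{x \in A_\gamma \cap B} h(x)\,d\gamma.$$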

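General proof. We consider only the upper expectation. We do not know how the
extremizing distribution function behaves outside the interval $B$. Therefore, we
suppose that the value of the extremizing distribution function at point $b_0$ is $F(b_0) =
\alpha \in [\underline{F}(b_0), \overline{F}(b_0)]$ and its value at point $b_1$ is $F(b_1) = \beta \in [\underline{F}(b_1), \overline{F}(b_1)]$ (see Fig. 2).
Then there holds

$$\int_{\mathbb{R}} I_B(x)\,dF(x) = \beta - \alpha.$$

Hence, we can write

$$\begin{aligned}
\overline{\mathbb{E}}(h|B) &= \sup_{\substack{\underline{F}(b_0) \le \alpha \le \overline{F}(b_0) \\ \underline{F}(b_1) \le \beta \le \overline{F}(b_1) \\ \underline{F} \le F \le \overline{F}}} \frac{1}{\beta - \alpha} \int_{\mathbb{R}} h(x)\,I_B(x)\,dF(x) \\
&= \sup_{\substack{\underline{F}(b_0) \le \alpha \le \overline{F}(b_0) \\ \underline{F}(b_1) \le \beta \le \overline{F}(b_1)}} \frac{1}{\beta - \alpha} \left( \sup_{\substack{\underline{F} \le F \le \overline{F} \\ F(b_0) = \alpha \\ F(b_1) = \beta}} \int_{\mathbb{R}} h(x)\,I_B(x)\,dF(x) \right) \\
&= \sup_{\substack{\underline{F}(b_0) \le \alpha \le \overline{F}(b_0) \\ \underline{F}(b_1) \le \beta \le \overline{F}(b_1)}} \frac{1}{\beta - \alpha} \int_\alpha^\beta \sup_{x \in A_\gamma \cap B} h(x)\,d\gamma. \qquad (9)
\end{aligned}$$

By using the results obtained for the unconditional upper expectation, we can see
that the inner supremum is equal to $\Psi(\alpha, \beta)$. The lower expectation is similarly proved.

□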



¹ Problems where the objective function is a fraction of two linear functions and constraints
are linear.

As the value $\beta - \alpha$ increases in Equations (7)-(8), so do the numerator and
denominator, thus playing opposite roles in the evolution of the objective function. Hence,
in order to compute the upper (resp. lower) conditional expectation, one has to
find the values $\beta$ and $\alpha$ such that any increase (resp. decrease) in the value $\beta - \alpha$ is
greater (resp. lower) than the corresponding increase (resp. decrease) in $\Psi(\alpha, \beta)$
(resp. $\Phi(\alpha, \beta)$).

A crude algorithm to approximate the solution would be to sample different
values $\alpha \in [\underline{F}(b_0), \overline{F}(b_0)]$ and $\beta \in [\underline{F}(b_1), \overline{F}(b_1)]$, evaluating Equations (7)-(8) for
all combinations $[\alpha, \beta]$ and retaining the highest (resp. lowest) obtained value (note that we can
have $\overline{F}(b_0) > \underline{F}(b_1)$, hence the need to make sure, by adding a constraint, that $[\alpha, \beta]$
is not void).
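The crude search just described can be sketched in Python for the upper conditional expectation. This is a minimal illustration under assumptions, not a reference implementation: the helper names (`linspace`, `grid_sup`, `crude_upper_conditional`), the regular grids, the midpoint rule in $\gamma$, and the hypothetical p-box (upper CDF uniform on $[0, 4]$, lower CDF uniform on $[2, 6]$) with $h(x) = -(x - 3)^2$ and $B = [1, 5]$ are all choices made here for the example.

```python
def linspace(lo, hi, n):
    """n equally spaced values from lo to hi, both included."""
    return [lo + (hi - lo) * k / (n - 1) for k in range(n)]


def grid_sup(h, lo, hi, n=41):
    """Heuristic supremum of h on [lo, hi]: maximum over an n-point grid."""
    return max(h(x) for x in linspace(lo, hi, n))


def crude_upper_conditional(h, a_low, a_up, F_low, F_up, b0, b1,
                            n_levels=6, n_gamma=100):
    """Grid search over (alpha, beta); returns (best value, alpha, beta)."""
    best = None
    for alpha in linspace(F_low(b0), F_up(b0), n_levels):
        for beta in linspace(F_low(b1), F_up(b1), n_levels):
            if beta <= alpha:
                continue  # the interval [alpha, beta] must not be void
            width = (beta - alpha) / n_gamma
            total = 0.0
            for k in range(n_gamma):
                g = alpha + (k + 0.5) * width          # midpoint rule in gamma
                lo = max(a_low(g), b0)                 # A_gamma intersected with B
                hi = min(a_up(g), b1)
                total += grid_sup(h, lo, hi) * width
            value = total / (beta - alpha)
            if best is None or value > best[0]:
                best = (value, alpha, beta)
    return best


# Hypothetical example: quantile functions and CDFs of the two bounding distributions.
a_low = lambda g: 4.0 * g                                  # inverse of the upper CDF
a_up = lambda g: 2.0 + 4.0 * g                             # inverse of the lower CDF
F_up = lambda x: min(1.0, max(0.0, x / 4.0))
F_low = lambda x: min(1.0, max(0.0, (x - 2.0) / 4.0))
h = lambda x: -(x - 3.0) ** 2

value, alpha, beta = crude_upper_conditional(h, a_low, a_up, F_low, F_up, 1.0, 5.0)
```

In this example every $A_\gamma$ with $\gamma \in [0.25, 0.75]$ contains the maximizer $x = 3$, so the search should retain $\alpha = 0.25$, $\beta = 0.75$ with a value close to $h(3) = 0$; the small negative residue comes from the grid used for the supremum. The cost grows with the product of the four grid sizes, which is why the text calls the algorithm crude.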

Another interesting point to note is that the proof takes advantage of both views,
since the idea of using levels $\alpha$ and $\beta$ comes from fractional linear programming, while
the final Equation (9) can be elegantly formulated by using the random set view.

In all cases (lower/upper and conditional/unconditional expectations), it is
obvious that the extremizing probability distribution $F$ providing the minimum (resp.
maximum) expectation of $h$ depends on the form of the function $h$. If this form
follows some typical cases, efficient solutions can be found to compute lower (resp.
upper) expectations. The simplest examples (for which solutions are well known)
of such typical cases are monotone functions.



3. The simple case of monotone functions 

We first consider the case where $h$ is a monotone function that is non-decreasing
(resp. non-increasing) in $\mathbb{R}$. We will also introduce the running example used
throughout the paper.

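3.1. Unconditional expectations. In the case of a monotone non-decreasing
(resp. non-increasing) function, existing results [31] tell us that we have:

(10)  $$\underline{\mathbb{E}}(h) = \int_{\mathbb{R}} h(x)\,d\overline{F} \qquad \left( \underline{\mathbb{E}}(h) = \int_{\mathbb{R}} h(x)\,d\underline{F} \right),$$

(11)  $$\overline{\mathbb{E}}(h) = \int_{\mathbb{R}} h(x)\,d\underline{F} \qquad \left( \overline{\mathbb{E}}(h) = \int_{\mathbb{R}} h(x)\,d\overline{F} \right),$$

and we see from (10)-(11) that lower and upper expectations are completely
determined by the bounding distributions $\underline{F}$ and $\overline{F}$. Using Equations (5)-(6), we get the
following formulas

(12)  $$\underline{\mathbb{E}}(h) = \int_0^1 h(a_{*\gamma})\,d\gamma \qquad \left( \underline{\mathbb{E}}(h) = \int_0^1 h(a^*_\gamma)\,d\gamma \right),$$

(13)  $$\overline{\mathbb{E}}(h) = \int_0^1 h(a^*_\gamma)\,d\gamma \qquad \left( \overline{\mathbb{E}}(h) = \int_0^1 h(a_{*\gamma})\,d\gamma \right),$$

which are the counterparts of Equations (10)-(11). Here, expectations are totally
determined by the extreme values of the mappings. When $h$ is non-monotone, Equations
(10)-(13) only provide inner approximations of $\underline{\mathbb{E}}(h), \overline{\mathbb{E}}(h)$. When using numerical
procedures over monotone functions, there appears to be no specific sampling
strategy that would allow for faster convergence.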

We now introduce the example that will illustrate our results throughout the paper.
Example 1. Assume that we have to estimate the loss incurred by the failure
of a unit of some industrial item. Suppose that this loss is a function of time
$h(x) = 20 - x$, and it is known that the unit time to failure is governed by a
distribution whose bounds are exponential distributions with failure rates 0.2 and
0.5 (note that only the bounds are of exponential nature). $h$ is decreasing and
can, for example, model the fact that the later the unit fails, the less it costs to
replace it. Let us compute the expected losses as the expectation of $h$. The lower
and upper distribution functions of the unit time to failure are $1 - \exp(-0.2x)$ and
$1 - \exp(-0.5x)$, respectively. Hence

$$\overline{\mathbb{E}}(h) = \int_0^\infty (20 - x)\,d(1 - \exp(-0.5x)) = \int_0^\infty (20 - x)\,0.5\,e^{-0.5x}\,dx = 18,$$

$$\underline{\mathbb{E}}(h) = \int_0^\infty (20 - x)\,d(1 - \exp(-0.2x)) = \int_0^\infty (20 - x)\,0.2\,e^{-0.2x}\,dx = 15.$$

Finally, we obtain that the expected losses are in the interval $[15, 18]$.

Let us use the random set approach. Since $\overline{F}^{-1}(\gamma) = -2\ln(1 - \gamma) = a_{*\gamma}$ and
$\underline{F}^{-1}(\gamma) = -5\ln(1 - \gamma) = a^*_\gamma$, then

$$\overline{\mathbb{E}}(h) = \int_0^1 (20 + 2\ln(1 - \gamma))\,d\gamma = 18,$$

$$\underline{\mathbb{E}}(h) = \int_0^1 (20 + 5\ln(1 - \gamma))\,d\gamma = 15.$$

We get the same values of the lower and upper expectations of $h$.
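As a quick numerical cross-check of Example 1, the two quantile integrals can be evaluated with a midpoint rule. The sketch below is illustrative only: the helper `midpoint_integral` and the number of subintervals are assumptions of this example, and the midpoint rule is chosen because it never evaluates the integrand at $\gamma = 1$, where $\ln(1 - \gamma)$ diverges.

```python
import math


def midpoint_integral(f, lo, hi, n=200_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * width) for k in range(n)) * width


h = lambda x: 20.0 - x
a_low = lambda g: -2.0 * math.log(1.0 - g)   # inverse of the upper CDF 1 - exp(-0.5 x)
a_up = lambda g: -5.0 * math.log(1.0 - g)    # inverse of the lower CDF 1 - exp(-0.2 x)

# h is non-increasing, so the upper expectation uses the left endpoints of A_gamma
# and the lower expectation uses the right endpoints.
upper = midpoint_integral(lambda g: h(a_low(g)), 0.0, 1.0)
lower = midpoint_integral(lambda g: h(a_up(g)), 0.0, 1.0)

# Closed forms: the mean of an exponential distribution with rate r is 1 / r.
upper_exact = 20.0 - 1.0 / 0.5
lower_exact = 20.0 - 1.0 / 0.2
```

The numerical values agree with the closed forms 18 and 15 up to the discretization error of the midpoint rule.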

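3.2. Conditional expectations. We now consider that we want to know the lower
and upper expectations in the case where the event $B = [b_0, b_1]$ occurs. That is, we want
to compute Equations (7)-(8) for a monotone $h$. Lower and upper expectations
are then given by the following proposition.

Proposition 2. Given a p-box $[\underline{F}, \overline{F}]$, a monotone function $h(x)$ and an event $B$,
the upper and lower conditional expectations of $h(X)$ on $[\underline{F}, \overline{F}]$ after observing the
event $B$ can be written

$$\begin{aligned}
\overline{\mathbb{E}}(h|B) &= \sup_{\substack{\underline{F}(b_0) \le \alpha \le \overline{F}(b_0) \\ \underline{F}(b_1) \le \beta \le \overline{F}(b_1)}} \frac{1}{\beta - \alpha} \int_\alpha^\beta \sup_{x \in A_\gamma \cap B} h(x)\,d\gamma \\
&= \frac{1}{\overline{F}(b_1) - \overline{F}(b_0)} \left( \int_{\underline{F}^{-1}(\overline{F}(b_0))}^{b_1} h(x)\,d\underline{F}(x) + h(b_1)\left(\overline{F}(b_1) - \underline{F}(b_1)\right) \right),
\end{aligned}$$

$$\begin{aligned}
\underline{\mathbb{E}}(h|B) &= \inf_{\substack{\underline{F}(b_0) \le \alpha \le \overline{F}(b_0) \\ \underline{F}(b_1) \le \beta \le \overline{F}(b_1)}} \frac{1}{\beta - \alpha} \int_\alpha^\beta \inf_{x \in A_\gamma \cap B} h(x)\,d\gamma \\
&= \frac{1}{\underline{F}(b_1) - \underline{F}(b_0)} \left( h(b_0)\left(\overline{F}(b_0) - \underline{F}(b_0)\right) + \int_{b_0}^{\overline{F}^{-1}(\underline{F}(b_1))} h(x)\,d\overline{F}(x) \right),
\end{aligned}$$

if $h$ is non-decreasing and

$$\overline{\mathbb{E}}(h|B) = \frac{1}{\underline{F}(b_1) - \underline{F}(b_0)} \left( h(b_0)\left(\overline{F}(b_0) - \underline{F}(b_0)\right) + \int_{b_0}^{\overline{F}^{-1}(\underline{F}(b_1))} h(x)\,d\overline{F}(x) \right),$$

$$\underline{\mathbb{E}}(h|B) = \frac{1}{\overline{F}(b_1) - \overline{F}(b_0)} \left( \int_{\underline{F}^{-1}(\overline{F}(b_0))}^{b_1} h(x)\,d\underline{F}(x) + h(b_1)\left(\overline{F}(b_1) - \underline{F}(b_1)\right) \right),$$

if $h$ is non-increasing.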

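Proof. We will only prove the upper expectation for a non-decreasing function $h$.
The lower expectation can be derived likewise, and the case of non-increasing functions
is then obtained by using the duality between lower and upper expectations.

When $h$ is non-decreasing, we know that $\sup_{x \in A_\gamma \cap B} h(x)$ is a non-decreasing
function of $\gamma$, attained at the point $\underline{F}^{-1}(\gamma)$. Using the integral mean value theorem, we
know that there exists some $z \in [b_0, b_1]$ such that $\overline{\mathbb{E}}(h|B) = h(z)$, whatever the
choice of $\alpha, \beta$. For maximizing $\overline{\mathbb{E}}(h|B)$, the values $\alpha, \beta$ should be chosen so that the
retained values $z$ and $h(z)$ (coinciding with $\underline{F}^{-1}$) are as high as possible. As $h$ is
non-decreasing, this corresponds to the values $\alpha = \overline{F}(b_0)$, $\beta = \overline{F}(b_1)$, which settles the
denominator of the objective function. We then have

$$\int_{\overline{F}(b_0)}^{\overline{F}(b_1)} \sup_{x \in A_\gamma \cap B} h(x)\,d\gamma = \int_{\underline{F}^{-1}(\overline{F}(b_0))}^{b_1} h(x)\,d\underline{F}(x) + h(b_1)\left(\overline{F}(b_1) - \underline{F}(b_1)\right),$$

because for values $\gamma \in [\overline{F}(b_0), \underline{F}(b_1)]$, the supremum of $h(x)$ on $A_\gamma \cap B$ is obtained for
$x = \underline{F}^{-1}(\gamma)$, while for $\gamma \in [\underline{F}(b_1), \overline{F}(b_1)]$ the supremum of $h(x)$ is obtained for $x = b_1$. □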




Figure 2. Conditional expectations with monotone non-increasing
functions (panels: optimal $F$ for $\overline{\mathbb{E}}(h|B)$ and optimal $F$ for $\underline{\mathbb{E}}(h|B)$)



Example 2. We consider the same p-box $[\underline{F}, \overline{F}]$ and function $h$ as in Example 1,
but now we want to know the incurred loss in case $x \in B = [1, 8]$,
that is, the failure is supposed to happen between 1 and 8 units of time. We have

$$\underline{F}(b_0) = 1 - \exp(-0.2 \cdot 1) = 0.18, \qquad \overline{F}(b_0) = 1 - \exp(-0.5 \cdot 1) = 0.39,$$

$$\underline{F}(b_1) = 1 - \exp(-0.2 \cdot 8) = 0.8, \qquad \overline{F}(b_1) = 1 - \exp(-0.5 \cdot 8) = 0.98,$$

and we get

$$\overline{\mathbb{E}}(h|B) = \frac{1}{0.8 - 0.18} \left( (20 - 1)(0.39 - 0.18) + \int_1^{\overline{F}^{-1}(0.8)} (20 - x)\,0.5\,e^{-0.5x}\,dx \right) = 18.298,$$

$$\underline{\mathbb{E}}(h|B) = \frac{1}{0.98 - 0.39} \left( (20 - 8)(0.98 - 0.8) + \int_{\underline{F}^{-1}(0.39)}^{8} (20 - x)\,0.2\,e^{-0.2x}\,dx \right) = 14.219.$$

Note that, if we compare the above values with those of Example 1, we have $[\underline{\mathbb{E}}(h), \overline{\mathbb{E}}(h)] \subset
[\underline{\mathbb{E}}(h|B), \overline{\mathbb{E}}(h|B)]$.
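The computation of Example 2 can be reproduced numerically from the random set form of the conditional bounds, $\frac{1}{\beta - \alpha}\int_\alpha^\beta \sup_{x \in A_\gamma \cap B} h(x)\,d\gamma$ and its infimum counterpart, using the optimal levels for a non-increasing $h$. The sketch below is only illustrative; the helper `midpoint_integral` and all variable names are assumptions of this example. It uses unrounded values of the four probabilities, so its results (about 18.41 and 14.23) differ slightly from the figures of the example, which are obtained from the two-digit rounded values 0.18, 0.39, 0.8 and 0.98.

```python
import math


def midpoint_integral(f, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * width) for k in range(n)) * width


b0, b1 = 1.0, 8.0
h = lambda x: 20.0 - x
F_low = lambda x: 1.0 - math.exp(-0.2 * x)     # lower CDF
F_up = lambda x: 1.0 - math.exp(-0.5 * x)      # upper CDF
a_low = lambda g: -2.0 * math.log(1.0 - g)     # inverse of the upper CDF
a_up = lambda g: -5.0 * math.log(1.0 - g)      # inverse of the lower CDF

# Upper bound, h non-increasing: alpha and beta taken on the lower CDF;
# the supremum of h on A_gamma ∩ B is reached at its left endpoint.
alpha, beta = F_low(b0), F_low(b1)
upper = midpoint_integral(lambda g: h(max(a_low(g), b0)), alpha, beta) / (beta - alpha)

# Lower bound: alpha and beta taken on the upper CDF;
# the infimum of h on A_gamma ∩ B is reached at its right endpoint.
alpha, beta = F_up(b0), F_up(b1)
lower = midpoint_integral(lambda g: h(min(a_up(g), b1)), alpha, beta) / (beta - alpha)
```

Either way, the conditional interval contains the unconditional interval $[15, 18]$ of Example 1.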

The above results indicate that, when $h$ is monotone, computing lower/upper
expectations exactly remains easy. Also, when using numerical methods, they provide
insight as to how values should be sampled. For example, when computing the upper
conditional expectation by linear programming, values only need to be sampled
in $[b_0, \overline{F}^{-1}(\underline{F}(b_1))]$, and $b_0$ should be among the sampled values, since an important
probability mass is concentrated at this value (see Fig. 2). When using the random
set approach and discretizing the unit interval $[0, 1]$, one should take $\gamma_1 = \underline{F}(b_0)$
and $\gamma_2 = \overline{F}(b_0)$, and not consider a finer discretization of this interval, as this would
not increase the precision. As we shall see, similar results can be derived for more
complex cases.

4. Function with one maximum 

In this section, we study the case where the function $h$ has one maximum at
point $a$, i.e., $h$ is increasing (resp. decreasing) in $(-\infty, a]$ (resp. $[a, \infty)$). The case
of $h$ having one minimum follows by considering the function $-h$ and the duality
between lower and upper expectations.

4.1. Unconditional expectations. As for monotone h, we first study the case of 
unconditional expectations. Before giving the main result, we show the next lemma 
that will be useful in subsequent proofs. 

Lemma 1. Given a p-box $[\underline{F}, \overline{F}]$ and a continuous function $h(x)$ with one maximum
at $x = a$, there is always a solution $\gamma \in [\underline{F}(a), \overline{F}(a)]$ to the following equation

(14)  $$h\left(\overline{F}^{-1}(\gamma)\right) = h\left(\underline{F}^{-1}(\gamma)\right).$$

Proof. Let us consider the function

$$\psi(\gamma) = h\left(\overline{F}^{-1}(\gamma)\right) - h\left(\underline{F}^{-1}(\gamma)\right),$$

which, being a difference of two continuous functions (by supposition), is
continuous. Since the function $h$ has its maximum at point $x = a$, then, by taking
$\gamma = \underline{F}(a)$, we get the inequality

$$\psi(\gamma) = h\left(\overline{F}^{-1}(\underline{F}(a))\right) - h(a) \le 0$$

and, by taking $\gamma = \overline{F}(a)$, we get the inequality

$$\psi(\gamma) = h(a) - h\left(\underline{F}^{-1}(\overline{F}(a))\right) \ge 0.$$

Consequently, there exists $\gamma$ in the interval $[\underline{F}(a), \overline{F}(a)]$ such that $\psi(\gamma) = 0$ (since
$\psi$ is continuous). □

Figure 3. Optimal distributions $F$ with unimodal $h$ (two panels, one for each expectation bound)
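The sign change used in the proof of Lemma 1 suggests a simple way to locate the level $\gamma$ numerically: bisection on $\psi$ over $[\underline{F}(a), \overline{F}(a)]$. The following Python sketch is an illustration under assumptions, not part of the original method: the function name `solve_level`, the tolerance, and the test function $h(x) = -(x - 3)^2$ are hypothetical, while the exponential bounds are those of Example 1.

```python
import math


def solve_level(h, a_low, a_up, gamma_lo, gamma_hi, tol=1e-12):
    """Bisection for psi(gamma) = h(a_low(gamma)) - h(a_up(gamma)) = 0.

    gamma_lo and gamma_hi are the lower and upper CDF values at the maximizer a,
    where the lemma guarantees psi(gamma_lo) <= 0 <= psi(gamma_hi).
    """
    psi = lambda g: h(a_low(g)) - h(a_up(g))
    lo, hi = gamma_lo, gamma_hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(mid) <= 0.0:
            lo = mid   # the root lies in the upper half
        else:
            hi = mid   # the root lies in the lower half
    return 0.5 * (lo + hi)


a = 3.0
h = lambda x: -(x - a) ** 2                    # hypothetical unimodal function
a_low = lambda g: -2.0 * math.log(1.0 - g)     # inverse of the upper CDF 1 - exp(-0.5 x)
a_up = lambda g: -5.0 * math.log(1.0 - g)      # inverse of the lower CDF 1 - exp(-0.2 x)
F_low_a = 1.0 - math.exp(-0.2 * a)
F_up_a = 1.0 - math.exp(-0.5 * a)

gamma = solve_level(h, a_low, a_up, F_low_a, F_up_a)
```

For this particular $h$, writing $t = -\ln(1 - \gamma)$ gives $\psi = 3t(7t - 6)$, so the root in the bracket is $\gamma = 1 - e^{-6/7} \approx 0.5756$, which the bisection recovers. Bisection only needs continuity of $\psi$, which is exactly what the lemma assumes.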

The next proposition shows that, as for monotone $h$, the fact of knowing that
$h$ has one maximum in $x = a$ allows us to derive closed-form expressions of lower
and upper expectations. The results of the proposition are illustrated in Fig. 3.

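Proposition 3. If the function $h$ has one maximum at point $a \in \mathbb{R}$, then the upper
and lower expectations of $h(X)$ on $[\underline{F}, \overline{F}]$ are

(15)  $$\overline{\mathbb{E}}(h) = \int_{-\infty}^{a} h(x)\,d\underline{F} + h(a)\left[\overline{F}(a) - \underline{F}(a)\right] + \int_{a}^{\infty} h(x)\,d\overline{F},$$

(16)  $$\underline{\mathbb{E}}(h) = \int_{-\infty}^{\overline{F}^{-1}(\alpha)} h(x)\,d\overline{F} + \int_{\underline{F}^{-1}(\alpha)}^{\infty} h(x)\,d\underline{F},$$

or, equivalently

(17)  $$\overline{\mathbb{E}}(h) = \int_0^{\underline{F}(a)} h(a^*_\gamma)\,d\gamma + \left[\overline{F}(a) - \underline{F}(a)\right] h(a) + \int_{\overline{F}(a)}^{1} h(a_{*\gamma})\,d\gamma,$$

(18)  $$\underline{\mathbb{E}}(h) = \int_0^{\alpha} h(a_{*\gamma})\,d\gamma + \int_{\alpha}^{1} h(a^*_\gamma)\,d\gamma,$$

where $\alpha$ is the solution of the equation

(19)  $$h\left(\overline{F}^{-1}(\alpha)\right) = h\left(\underline{F}^{-1}(\alpha)\right)$$

such that $\alpha \in [\underline{F}(a), \overline{F}(a)]$.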
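To make the closed forms concrete, the sketch below evaluates the random set expressions of Proposition 3 (the integrals of $h(a^*_\gamma)$ and $h(a_{*\gamma})$ split at $\underline{F}(a)$, $\overline{F}(a)$ and $\alpha$) on a small hypothetical p-box and compares them with a brute-force evaluation of $\int_0^1 \inf_{x \in A_\gamma} h(x)\,d\gamma$ and $\int_0^1 \sup_{x \in A_\gamma} h(x)\,d\gamma$. Everything specific here is an assumption chosen so that the answer can be checked by hand: the upper CDF is uniform on $[0, 4]$, the lower CDF is uniform on $[2, 6]$, $h(x) = -(x - 3)^2$, and the helper names are hypothetical.

```python
def midpoint_integral(f, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    width = (hi - lo) / n
    return sum(f(lo + (k + 0.5) * width) for k in range(n)) * width


def grid_values(h, lo, hi, n=201):
    """Values of h on an n-point grid of [lo, hi], endpoints included."""
    return [h(lo + (hi - lo) * k / (n - 1)) for k in range(n)]


a = 3.0
h = lambda x: -(x - a) ** 2
a_low = lambda g: 4.0 * g            # inverse of the upper CDF (uniform on [0, 4])
a_up = lambda g: 2.0 + 4.0 * g       # inverse of the lower CDF (uniform on [2, 6])
F_low_a, F_up_a = 0.25, 0.75         # lower and upper CDF values at a = 3
alpha = 0.5                          # solves h(a_low(alpha)) = h(a_up(alpha))

# Closed forms in their random set version.
upper = (midpoint_integral(lambda g: h(a_up(g)), 0.0, F_low_a)
         + (F_up_a - F_low_a) * h(a)
         + midpoint_integral(lambda g: h(a_low(g)), F_up_a, 1.0))
lower = (midpoint_integral(lambda g: h(a_low(g)), 0.0, alpha)
         + midpoint_integral(lambda g: h(a_up(g)), alpha, 1.0))

# Brute force: grid search of the extremum of h on each A_gamma.
brute_lower = midpoint_integral(
    lambda g: min(grid_values(h, a_low(g), a_up(g))), 0.0, 1.0, n=1000)
brute_upper = midpoint_integral(
    lambda g: max(grid_values(h, a_low(g), a_up(g))), 0.0, 1.0, n=1000)
```

By hand, the lower expectation is $-13/3$ and the upper expectation is $-1/6$ for this example, and the brute-force values agree up to discretization error. The closed forms need one-dimensional integrals only, whereas the brute-force version performs a search on every level.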

Proof using linear programming. We assume that the function $h(x)$ is
differentiable in $\mathbb{R}$ and has a finite value as $x \to \infty$. The lower and upper cumulative
probability functions $\underline{F}$ and $\overline{F}$ are also assumed to be differentiable. We also
consider the primal and dual problems introduced in Section 2.1 and recalled below.

Primal problem:

$$\text{Min. } v = \int_{-\infty}^{\infty} h(x)\,p(x)\,dx$$

subject to

$$p(x) \ge 0, \qquad \int_{-\infty}^{\infty} p(x)\,dx = 1,$$
$$-\int_{-\infty}^{x} p(t)\,dt \ge -\overline{F}(x), \qquad \int_{-\infty}^{x} p(t)\,dt \ge \underline{F}(x).$$

Dual problem:

$$\text{Max. } w = c_0 + \int_{-\infty}^{\infty} \left(-c(t)\,\overline{F}(t) + d(t)\,\underline{F}(t)\right) dt$$

subject to

$$c_0 + \int_{x}^{\infty} \left(-c(t) + d(t)\right) dt \le h(x),$$
$$c_0 \in \mathbb{R}, \qquad c(x) \ge 0, \qquad d(x) \ge 0.$$

The proof of Equations (15)-(18) and (19) can be separated into three main steps:

(1) We propose a feasible solution of the primal problem.

(2) We then consider the feasible solution of the dual problem corresponding
to the one proposed for the primal problem.

(3) We show that the two solutions coincide and, therefore, according to the
basic duality theorem of linear programming, these solutions are optimal
ones.

First, we consider the primal problem. Let $a'$ and $a''$ be real values. The function

$$p(x) = \begin{cases} d\overline{F}(x)/dx, & x < a', \\ 0, & a' \le x \le a'', \\ d\underline{F}(x)/dx, & a'' < x \end{cases}$$

is a feasible solution to the primal problem if the following condition is respected:

$$\int_{-\infty}^{\infty} p(x)\,dx = 1,$$

which, given the above solution, can be rewritten

$$\int_{-\infty}^{a'} d\overline{F} + \int_{a''}^{\infty} d\underline{F} = 1,$$

which is equivalent to the equality

(20)  $$\overline{F}(a') = \underline{F}(a'').$$

We now turn to the dual problem. Let us first consider the sole constraint

(21)  $$c_0 + \int_{x}^{\infty} \left(-c(t) + d(t)\right) dt \le h(x),$$

which is the counterpart of the primal constraint $p(x) \ge 0$. We then consider the
following feasible solution to the dual problem: $c_0 = h(\infty)$,

$$c(x) = \begin{cases} h'(x), & x < a', \\ 0, & x \ge a', \end{cases} \qquad d(x) = \begin{cases} 0, & x < a'', \\ -h'(x), & x \ge a''. \end{cases}$$

The inequalities c (x) > and d{x) > are valid provided we have the inequalities 
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a' <a < a" (i.e. interval [a', a"] encompasses maximum of h). By integrating c {x) 
and d (x) , we get the increasing function 

C{x) = - r c{t)dt={ ^<a' 

10, x>a' 



and the decreasing function 

poo 

D (x) = / d{t)dt - 



h {a") - h (oo) , X < a" 
h (x) — h (oo) , X > a" ' 

Let us rewrite condition (21) as follows:

(22) $c_0 + C(x) + D(x) \le h(x)$.

If $x < a'$, inequality (22) becomes

$$c_0 + h(x) - h(a') + h(a'') - h(\infty) \le h(x).$$

Replacing the inequality by an equality (simply taking the upper bound of the constraint), we obtain

(23) $h(a'') = h(a')$.

If $a' < x < a''$, we have $c_0 + h(a'') - h(\infty) \le h(x)$, which means that for all $x \in (a', a'')$ we have $h(a'') \,(= h(a')) \le h(x)$ (i.e., $h(a')$ and $h(a'')$ are the minimal values of the function $h(x)$ on the interval $(a', a'')$). If $x \ge a''$, then we get the trivial equality $c_0 + h(x) - h(\infty) = h(x)$. The two proposed solutions are valid iff there exist solutions to Eq. (20) and Eq. (23), respectively for the primal and dual problems. That such solutions exist can be seen by considering Lemma 1 and taking $a' = \overline{F}^{-1}(\gamma)$ and $a'' = \underline{F}^{-1}(\gamma)$, with $\gamma$ the solution of Eq. (19). We then find the admissible values of the objective functions

$$v_{\min} = \int_{-\infty}^{a'} h(x)\,d\overline{F} + \int_{a''}^{\infty} h(x)\,d\underline{F},$$

$$w_{\max} = c_0 + \int_{-\infty}^{\infty} \left(-c(t)\,\overline{F}(t) + d(t)\,\underline{F}(t)\right) dt.$$

By using integration by parts together with equations (20)-(23), we can show that the equality $w_{\max} = v_{\min}$ holds, with $\gamma$ the particular solution of equation (19) for which the optimum is reached, as was to be proved. □
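The integration-by-parts step invoked at the end of this proof can be written out as follows. This is only a sketch: it assumes that $h(x)\,\overline{F}(x) \to 0$ as $x \to -\infty$ and that $\underline{F}(x) \to 1$ as $x \to \infty$, two boundary conditions that the proof does not state explicitly.

```latex
\begin{align*}
w_{\max} &= h(\infty) - \int_{-\infty}^{a'} h'(t)\,\overline{F}(t)\,dt
            - \int_{a''}^{\infty} h'(t)\,\underline{F}(t)\,dt, \\
\int_{-\infty}^{a'} h'(t)\,\overline{F}(t)\,dt
         &= h(a')\,\overline{F}(a') - \int_{-\infty}^{a'} h(x)\,d\overline{F}, \\
\int_{a''}^{\infty} h'(t)\,\underline{F}(t)\,dt
         &= h(\infty) - h(a'')\,\underline{F}(a'') - \int_{a''}^{\infty} h(x)\,d\underline{F}, \\
w_{\max} &= v_{\min} + h(a'')\,\underline{F}(a'') - h(a')\,\overline{F}(a') = v_{\min}.
\end{align*}
```

The last equality uses $\overline{F}(a') = \underline{F}(a'')$ from (20) and $h(a') = h(a'')$ from (23), so the two boundary terms cancel.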

Proof using random sets. Let us now consider equations (17)-(18). Looking first at equation (17), we see that before $\gamma = \underline{F}(a)$, the supremum of $h$ on $A_\gamma$ is $h(a^*_\gamma)$, since $h$ is increasing on $(-\infty, a]$. Between $\gamma = \underline{F}(a)$ and $\gamma = \overline{F}(a)$, the supremum of $h$ on $A_\gamma$ is $h(a)$. After $\gamma = \overline{F}(a)$, we can apply the same reasoning as for the increasing part of $h$ (except that $h$ is now decreasing). Finally, this gives us the following formula:

(24) $\overline{E}(h) = \int_{0}^{\underline{F}(a)} h(a^*_\gamma)\,d\gamma + \int_{\underline{F}(a)}^{\overline{F}(a)} h(a)\,d\gamma + \int_{\overline{F}(a)}^{1} h(a_{*\gamma})\,d\gamma$

which is equivalent to (17). Let us now turn to the lower expectation. Before $\gamma = \underline{F}(a)$ and after $\gamma = \overline{F}(a)$, finding the infimum is again not a problem (it is respectively $h(a_{*\gamma})$ and $h(a^*_\gamma)$). Between $\gamma = \underline{F}(a)$ and $\gamma = \overline{F}(a)$, since we know that $h$ is increasing before $x = a$ and decreasing after, the infimum is either $h(a_{*\gamma})$ or $h(a^*_\gamma)$. This gives us the equation

(25) $\underline{E}(h) = \int_{0}^{\underline{F}(a)} h(a_{*\gamma})\,d\gamma + \int_{\underline{F}(a)}^{\overline{F}(a)} \min\left(h(a_{*\gamma}), h(a^*_\gamma)\right) d\gamma + \int_{\overline{F}(a)}^{1} h(a^*_\gamma)\,d\gamma$

and if we use equations (20), (23) as in the first proof (the reasoning used in the first proof to show that they have a solution is general, and thus applicable here), we know that there is a level $\alpha$ such that $h(\overline{F}^{-1}(\alpha)) = h(\underline{F}^{-1}(\alpha))$, and for which the above equation simplifies to equation (18). □



Figure 3 shows that the extremizing distribution corresponding to the upper expectation concentrates as much probability mass as possible on the maximum, as could have been expected, while the cumulative distribution reaching the lower expectation contains a horizontal jump avoiding the higher values of $h$. As we shall see, finding the level $\alpha$ satisfying Equation (20) and at which this jump occurs is sometimes feasible, and in this case exact lower and upper expectations can be found. In other cases, when computing the upper expectation by numerical methods and linear programming, the results indicate that it is important to include the value $a$ corresponding to the maximum of $h$ among the sampled values, as well as values close to it. When using the random set approach, they show that there is no need to consider values $\gamma$ inside the interval $[\underline{F}(a), \overline{F}(a)]$, the bounds being sufficient. For the lower expectation, the results indicate that when using linear programming, it is preferable to sample outside the interval $[\overline{F}^{-1}(\alpha), \underline{F}^{-1}(\alpha)]$.

However, it can happen that the exact value of $\alpha$ cannot be computed, but that the integrals in Eq. (15)-(16) can still be solved. In this case, lower and upper expectations have to be approximated, for example by scanning a more or less wide range of possible values for $\alpha$ (see [28] for an example).
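The remark about the random set approach can be illustrated numerically. The sketch below is not the authors' implementation: it assumes the exponential p-box used in the paper's examples (upper CDF $1 - e^{-0.5x}$, lower CDF $1 - e^{-0.2x}$) and borrows the single-maximum function $h(x) = 60 - (x-5)^2$ of Example 3; the helper names (`focal_set`, `midpoint_sum`, and so on) are hypothetical. It discretizes the levels $\gamma$ with a midpoint rule and compares a full discretization of $[0,1]$ with one that skips $[\underline{F}(a), \overline{F}(a)]$ and adds the exact mass $h(a)[\overline{F}(a) - \underline{F}(a)]$ instead.

```python
import math

A_MAX = 5.0  # location of the single maximum of h (assumed test function)

def h(x):
    return 60.0 - (x - 5.0) ** 2

def focal_set(gamma):
    """Focal set A_gamma = [a_{*gamma}, a^*_gamma] of the assumed exponential p-box."""
    lo = -2.0 * math.log(1.0 - gamma)  # inverse of the upper CDF 1 - exp(-0.5 x)
    hi = -5.0 * math.log(1.0 - gamma)  # inverse of the lower CDF 1 - exp(-0.2 x)
    return lo, hi

def sup_h(gamma):
    """Supremum of h over A_gamma for a function with one maximum at A_MAX."""
    lo, hi = focal_set(gamma)
    if lo <= A_MAX <= hi:
        return h(A_MAX)
    return max(h(lo), h(hi))

def inf_h(gamma):
    """Infimum of h over A_gamma: always reached at one of the two bounds."""
    lo, hi = focal_set(gamma)
    return min(h(lo), h(hi))

def midpoint_sum(f, lo, hi, m):
    """Midpoint rule on m equal cells; never evaluates f at gamma = 0 or 1."""
    step = (hi - lo) / m
    return step * sum(f(lo + (k + 0.5) * step) for k in range(m))

def upper_full(m=100_000):
    return midpoint_sum(sup_h, 0.0, 1.0, m)

def upper_split(m=50_000):
    g_lo = 1.0 - math.exp(-0.2 * A_MAX)  # lower CDF at a
    g_hi = 1.0 - math.exp(-0.5 * A_MAX)  # upper CDF at a
    left = midpoint_sum(sup_h, 0.0, g_lo, m)
    right = midpoint_sum(sup_h, g_hi, 1.0, m)
    return left + h(A_MAX) * (g_hi - g_lo) + right

def lower_full(m=100_000):
    return midpoint_sum(inf_h, 0.0, 1.0, m)

if __name__ == "__main__":
    print(upper_full(), upper_split(), lower_full())
```

Both upper-expectation variants should agree with each other up to discretization error, which is the point of the remark: levels inside $[\underline{F}(a), \overline{F}(a)]$ carry no information beyond the constant value $h(a)$.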

Example 3. We still consider the same p-box as in the previous examples, but we now suppose that the loss is modelled by the function $h(x) = 60 - (x-5)^2$. This loss function can express the idea that it is preferable for the unit to fail when it begins to work or when it has worked for a long time, rather than when it works at full capacity, as the cost of slowing a whole production line would then be much higher. $h$ has one maximum at $a = 5$, and we get

$$\overline{E}(h) = h(5)\left[\overline{F}(5) - \underline{F}(5)\right] + \int_{0}^{5} h(x)\,d\underline{F}(x) + \int_{5}^{\infty} h(x)\,d\overline{F}(x)$$

$$= 60 \cdot \left(\exp(-0.2 \cdot 5) - \exp(-0.5 \cdot 5)\right) + 31.321 + 4.268 = 52.736.$$

Since $\overline{F}^{-1}(\alpha) = -2\ln(1-\alpha)$ and $\underline{F}^{-1}(\alpha) = -5\ln(1-\alpha)$, $\alpha$ can be found by solving the following equality:

$$60 - \left(-2\ln(1-\alpha) - 5\right)^2 = 60 - \left(-5\ln(1-\alpha) - 5\right)^2.$$

Hence, we have two solutions, $\alpha = 1 - \exp(-10/7)$ and $\alpha = 0$. Since $\overline{F}^{-1}(0) = \underline{F}^{-1}(0)$, the second solution has to be removed. Therefore, we get $\alpha = 1 - \exp(-10/7) = 0.76$. Hence, we obtain

$$\underline{E}(h) = \int_{-\infty}^{-2\ln(1-0.76)} h(x)\,d\overline{F}(x) + \int_{-5\ln(1-0.76)}^{\infty} h(x)\,d\underline{F}(x)$$

$$= \int_{0}^{2.85} \left(60 - (x-5)^2\right) 0.5e^{-0.5x}\,dx + \int_{7.14}^{\infty} \left(60 - (x-5)^2\right) 0.2e^{-0.2x}\,dx = 29.745.$$

Finally, we obtain the interval of expected losses [29.745, 52.736]. Using the random set approach, we get

$$\overline{E}(h) = \int_{0}^{1-\exp(-0.2 \cdot 5)} \left(60 - (-5\ln(1-\gamma) - 5)^2\right) d\gamma + h(5)\left[\overline{F}(5) - \underline{F}(5)\right] + \int_{1-\exp(-0.5 \cdot 5)}^{1} \left(60 - (-2\ln(1-\gamma) - 5)^2\right) d\gamma = 52.736,$$

$$\underline{E}(h) = \int_{0}^{0.76} \left(60 - (-2\ln(1-\gamma) - 5)^2\right) d\gamma + \int_{0.76}^{1} \left(60 - (-5\ln(1-\gamma) - 5)^2\right) d\gamma = 29.745.$$

If the function $h$ is symmetric about $a$, i.e., the equality $h(a - x) = h(a + x)$ is valid for all $x \in \mathbb{R}$, then the value of $\alpha$ in (19) does not depend on $h$ and is determined by

$$a - \overline{F}^{-1}(\alpha) = \underline{F}^{-1}(\alpha) - a.$$

Note that expressions (10), (11) can be obtained from (15), (16) by taking $a \to \infty$.
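The computations of Example 3 can be reproduced with a few lines of code. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the truncation point `X_INF` that stands in for $+\infty$, and the choice of bisection plus a composite Simpson rule are all illustrative. It solves equation (19) on $[\underline{F}(a), \overline{F}(a)]$ and then evaluates the two expectations in the form (15)-(16).

```python
import math

A_MAX = 5.0    # maximum of h
X_INF = 200.0  # assumed truncation point standing in for +infinity

def h(x):
    return 60.0 - (x - 5.0) ** 2

# Exponential p-box of the examples: lower CDF has rate 0.2, upper CDF has rate 0.5.
def cdf_low(x): return 1.0 - math.exp(-0.2 * x)
def cdf_up(x):  return 1.0 - math.exp(-0.5 * x)
def pdf_low(x): return 0.2 * math.exp(-0.2 * x)
def pdf_up(x):  return 0.5 * math.exp(-0.5 * x)
def inv_low(g): return -5.0 * math.log(1.0 - g)
def inv_up(g):  return -2.0 * math.log(1.0 - g)

def simpson(f, lo, hi, n=20_000):
    """Composite Simpson rule with n (even) sub-intervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * step)
    return total * step / 3.0

def solve_alpha(tol=1e-12):
    """Bisection for h(inv_up(alpha)) = h(inv_low(alpha)) on [cdf_low(a), cdf_up(a)]."""
    def psi(g):
        return h(inv_up(g)) - h(inv_low(g))
    lo, hi = cdf_low(A_MAX), cdf_up(A_MAX)  # psi(lo) < 0 < psi(hi) for this h
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if psi(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def upper_expectation():
    return (simpson(lambda x: h(x) * pdf_low(x), 0.0, A_MAX)
            + h(A_MAX) * (cdf_up(A_MAX) - cdf_low(A_MAX))
            + simpson(lambda x: h(x) * pdf_up(x), A_MAX, X_INF))

def lower_expectation():
    alpha = solve_alpha()
    return (simpson(lambda x: h(x) * pdf_up(x), 0.0, inv_up(alpha))
            + simpson(lambda x: h(x) * pdf_low(x), inv_low(alpha), X_INF))

if __name__ == "__main__":
    print(solve_alpha(), lower_expectation(), upper_expectation())
```

Because this $h$ is symmetric about $a = 5$, the root can also be checked by hand from the symmetry condition: $5 + 2\ln(1-\alpha) = -5\ln(1-\alpha) - 5$ gives $\ln(1-\alpha) = -10/7$.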

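4.2. Conditional expectations. We now consider conditioning by an event $B = [b_0, b_1]$, while $h$ is still assumed to have one maximum. The following proposition indicates how lower and upper conditional expectations can be computed in this case.

Proposition 4. If the function $h$ has one maximum at point $a \in \mathbb{R}$, then the upper and lower conditional expectations of $h(X)$ on $[\underline{F}, \overline{F}]$ after observing the event $B$ are

$$\overline{E}(h|B) = \sup_{\substack{\underline{F}(b_0) \le \alpha \le \overline{F}(b_0) \\ \underline{F}(b_1) \le \beta \le \overline{F}(b_1)}} \frac{1}{\beta - \alpha}\,\Psi(\alpha, \beta),$$

$$\underline{E}(h|B) = \inf_{\substack{\underline{F}(b_0) \le \alpha \le \overline{F}(b_0) \\ \underline{F}(b_1) \le \beta \le \overline{F}(b_1)}} \frac{1}{\beta - \alpha}\,\Phi(\alpha, \beta),$$

with

$$\Psi(\alpha, \beta) = I_{(\alpha < \underline{F}(a))} \int_{\underline{F}^{-1}(\alpha)}^{a} h(x)\,d\underline{F} + I_{(\beta > \overline{F}(a))} \int_{a}^{\overline{F}^{-1}(\beta)} h(x)\,d\overline{F} + h(a)\left(\min(\overline{F}(a), \beta) - \max(\underline{F}(a), \alpha)\right),$$

$$\Phi(\alpha, \beta) = h(b_0)\left(\overline{F}(b_0) - \alpha\right) + \int_{b_0}^{\overline{F}^{-1}(\varepsilon)} h(x)\,d\overline{F} + h(b_1)\left(\beta - \underline{F}(b_1)\right) + \int_{\underline{F}^{-1}(\varepsilon)}^{b_1} h(x)\,d\underline{F}.$$

Here $I_{(a<b)}$ is the indicator function taking the value 1 if $a < b$ and 0 if $a \ge b$; $\varepsilon$ is one of the roots of the following equation:

(26) $h(\overline{F}^{-1}(\varepsilon)) = h(\underline{F}^{-1}(\varepsilon))$.

Proof. The proof follows from Proposition 1, where $\Psi(\alpha, \beta)$ and $\Phi(\alpha, \beta)$ are respectively replaced by the formulas given in Proposition 3. □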

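Example 4. We consider the same $h$ as in Example 3, the same p-box $[\underline{F}, \overline{F}]$ as in the other examples, and the conditioning event $B = [1, 8]$. From Example 3, the solutions of Eq. (26) are $\varepsilon = 1 - \exp(-10/7) = 0.76$, $\underline{F}^{-1}(\varepsilon) = 7.14$, $\overline{F}^{-1}(\varepsilon) = 2.85$.

We also have $a = 5$, $\underline{F}(a) = 1 - \exp(-0.2 \cdot 5) = 0.63$, $\overline{F}(a) = 1 - \exp(-0.5 \cdot 5) = 0.92$. Let us first concentrate on

$$\overline{E}(h|B) = \sup_{\substack{0.18 \le \alpha \le 0.39 \\ 0.8 \le \beta \le 0.98}} \frac{1}{\beta - \alpha}\,\Psi(\alpha, \beta),$$

where

$$\Psi(\alpha, \beta) = I_{(\alpha < 0.63)} \int_{-5\ln(1-\alpha)}^{5} \left(60 - (x-5)^2\right) 0.2e^{-0.2x}\,dx + I_{(\beta > 0.92)} \int_{5}^{-2\ln(1-\beta)} \left(60 - (x-5)^2\right) 0.5e^{-0.5x}\,dx + 60\left(\min(1 - e^{-0.5 \cdot 5}, \beta) - \max(1 - e^{-0.2 \cdot 5}, \alpha)\right)$$

$$= \left(25\alpha\ln^2(1-\alpha) - 25\ln^2(1-\alpha) - 35\alpha + 31.32\right) + 60\left(\min(0.92, \beta) - 0.63\right) + I_{(\beta > 0.92)}\left(4(1-\beta)\ln^2(1-\beta) + 12(1-\beta)\ln(1-\beta) + 47\beta - 42.73\right).$$

Since $0.18 \le \alpha \le 0.39$, we have $I_{(\alpha < 0.63)} = 1$. Let us then consider the two sets of values $[0.8, 0.92]$ and $(0.92, 0.98]$ for which $I_{(\beta > 0.92)}$ takes different values, and the respective functions $\Psi_1(\alpha, \beta)$, $\Psi_2(\alpha, \beta)$ associated with them:

$$\Psi_1(\alpha, \beta) = 25\alpha\ln^2(1-\alpha) - 25\ln^2(1-\alpha) - 35\alpha + 31.32 + 60(\beta - 0.63),$$

$$\Psi_2(\alpha, \beta) = 25\alpha\ln^2(1-\alpha) - 25\ln^2(1-\alpha) - 35\alpha + 31.32 + 4(1-\beta)\ln^2(1-\beta) + 12(1-\beta)\ln(1-\beta) + 47\beta - 42.73 + 17.4.$$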

It can be checked that the derivative $\partial\left[\Psi_1(\alpha, \beta)/(\beta - \alpha)\right]/\partial\beta$ is positive for $0.18 \le \alpha \le 0.39$, hence the maximum of $\Psi_1(\alpha, \beta)/(\beta - \alpha)$ is achieved at $\beta = 0.98$. Also, since $\Psi_1(\alpha, 0.98)/(0.98 - \alpha)$ decreases as $\alpha$ increases, we have

$$\sup \frac{1}{\beta - \alpha}\,\Psi_1(\alpha, \beta) = \frac{1}{0.98 - 0.18}\,\Psi_1(0.18, 0.98) = 56.52.$$

A similar analysis for $\Psi_2(\alpha, \beta)/(\beta - \alpha)$ shows that the maximum is achieved for $\alpha = 0.39$, $\beta = 0.8$. Hence

$$\sup \frac{1}{\beta - \alpha}\,\Psi_2(\alpha, \beta) = \frac{1}{0.8 - 0.39}\,\Psi_2(0.39, 0.8) = 59.57$$

and, finally, we have $\overline{E}(h|B) = \max(56.52, 59.57) = 59.57$. Figure 4 gives an illustration of the extremizing cumulative distribution for which this upper conditional expectation is reached.

Figure 4. Optimal distribution (thick) for computing upper conditional expectation on $B = [1, 8]$

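Let us now detail the computations for

$$\underline{E}(h|B) = \inf_{\substack{0.18 \le \alpha \le 0.39 \\ 0.8 \le \beta \le 0.98}} \frac{1}{\beta - \alpha}\,\Phi(\alpha, \beta),$$

where

$$\Phi(\alpha, \beta) = \left(60 - (1-5)^2\right)(0.39 - \alpha) + \int_{1}^{2.85} \left(60 - (x-5)^2\right) 0.5e^{-0.5x}\,dx + \left(60 - (8-5)^2\right)(\beta - 0.8) + \int_{7.14}^{8} \left(60 - (x-5)^2\right) 0.2e^{-0.2x}\,dx$$

$$= 51\beta - 44\alpha - 3.54.$$

The function $\Phi(\alpha, \beta)/(\beta - \alpha)$ increases as $\alpha$ increases for any $0.8 \le \beta \le 0.98$, and increases as $\beta$ increases. This implies that $\underline{E}(h|B) = \frac{1}{0.8 - 0.18}\left(51 \cdot 0.8 - 44 \cdot 0.18 - 3.54\right) = 47.32$.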
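The monotonicity argument for the lower conditional expectation can be checked by brute force. The sketch below is only a sanity check, taking the affine form $\Phi(\alpha, \beta) = 51\beta - 44\alpha - 3.54$ reported in the example as given; the grid size and the function names are arbitrary choices.

```python
def phi(alpha, beta):
    """Affine form of Phi(alpha, beta) reported in Example 4 (taken as given)."""
    return 51.0 * beta - 44.0 * alpha - 3.54

def ratio(alpha, beta):
    return phi(alpha, beta) / (beta - alpha)

def grid(lo, hi, n):
    """n + 1 equally spaced points from lo to hi, with lo included exactly."""
    return [lo + (hi - lo) * k / n for k in range(n + 1)]

def lower_conditional(n=200):
    """Smallest value of Phi / (beta - alpha) on the grid, with its minimizer."""
    return min((ratio(a, b), a, b)
               for a in grid(0.18, 0.39, n)
               for b in grid(0.8, 0.98, n))

if __name__ == "__main__":
    print(lower_conditional())
```

On this grid the minimum is reached at the corner $\alpha = 0.18$, $\beta = 0.8$, in agreement with the claim that the ratio increases in both arguments.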

Note that, in the general case, four functions $\Psi_i$ (corresponding to all combinations of values of $I_{(\alpha < \underline{F}(a))}$, $I_{(\beta > \overline{F}(a))}$ inside $\{0, 1\}^2$) would have to be considered in the computation of $\overline{E}(h|B)$. Example 4 illustrates well the fact that when $h$ is non-monotone, analytical solutions can still be found in some cases, but that they tend to become tedious to compute. This will be confirmed in the next section.

5. Functions with local maxima/minima

Now we consider a general form of the function $h$, i.e., the function $h(x)$ has alternating local maxima at points $a_i$, $i = 1, 2, \ldots$ and minima at points $b_i$, $i = 0, 1, 2, \ldots$, such that

(27) $b_0 < a_1 < b_1 < \ldots < b_i < a_{i+1} < b_{i+1} < \ldots$

Note that, in this case, studying the shape of the extremizing cumulative distribution reaching the lower expectation is sufficient, thanks to the duality between lower and upper expectations.

Proposition 5. If the local maxima ($a_i$) and minima ($b_i$) of the function $h$ satisfy condition (27), then the extremizing distribution $F$ for computing the lower unconditional expectation $\underline{E}(h)$ has discontinuities (vertical jumps) at points $b_i$, $i = 1, \ldots$, of size

$$\min\left(\overline{F}(b_i), \alpha_{i+1}\right) - \max\left(\underline{F}(b_i), \alpha_i\right).$$

Between points $b_{i-1}$ and $b_i$, that is between discontinuities numbered $i-1$ and $i$, the extremizing cumulative probability distribution function $F$ is of the form:

$$F(x) = \begin{cases} \overline{F}(x), & x < a' \\ \alpha, & a' \le x < a'' \\ \underline{F}(x), & a'' \le x \end{cases}$$

where $\alpha$ is the root of the equation

$$h\left(\max\left(\overline{F}^{-1}(\alpha), b_{i-1}\right)\right) = h\left(\min\left(\underline{F}^{-1}(\alpha), b_i\right)\right)$$

in the interval $[\underline{F}(a_i), \overline{F}(a_i)]$, and $a'$, $a''$ are such that

$$a' = \max\left(\overline{F}^{-1}(\alpha), b_{i-1}\right), \qquad a'' = \min\left(\underline{F}^{-1}(\alpha), b_i\right).$$

The upper expectation $\overline{E}(h)$ can be found from the condition $\overline{E}(h) = -\underline{E}(-h)$.

Proof using linear programming. This proof is based on the investigation of the following local primal and dual optimization problems for computing the lower expectation of $h$ in the finite interval $[b_0, b_1)$, where $h$ has one maximum at point $a_1$.

Primal problem:

$$\text{Min. } v = \int_{b_0}^{b_1} h(x)\,f(x)\,dx$$

subject to

$$f(x) \ge 0, \quad F_0 \ge 0, \quad F_1 \ge 0,$$
$$-\int_{b_0}^{x} f(t)\,dt - F_0 \ge -\overline{F}(x), \qquad \int_{b_0}^{x} f(t)\,dt + F_0 \ge \underline{F}(x),$$
$$-F_0 \ge -\overline{F}(b_0), \quad F_0 \ge \underline{F}(b_0), \quad -F_1 \ge -\overline{F}(b_1), \quad F_1 \ge \underline{F}(b_1),$$
$$\int_{b_0}^{b_1} f(t)\,dt + F_0 - F_1 = 0.$$

Dual problem:

$$\text{Max. } w = -c_0\overline{F}(b_0) + d_0\underline{F}(b_0) - c_1\overline{F}(b_1) + d_1\underline{F}(b_1) + \int_{b_0}^{b_1} \left(-\overline{F}(x)\,c(x) + \underline{F}(x)\,d(x)\right) dx$$

subject to

$$e + \int_{x}^{b_1} \left(-c(t) + d(t)\right) dt \le h(x),$$
$$e - c_0 + d_0 + \int_{b_0}^{b_1} \left(-c(t) + d(t)\right) dt \le 0,$$
$$-e - c_1 + d_1 \le 0,$$
$$c(x) \ge 0, \ c_0 \ge 0, \ c_1 \ge 0, \quad d(x) \ge 0, \ d_0 \ge 0, \ d_1 \ge 0, \quad e \in \mathbb{R}.$$

The optimal solutions of the above problems correspond to the extremizing distribution for values $x \in [b_0, b_1)$. $F_0 := F(b_0)$ and $F_1 := F(b_1)$ respectively stand for the values of the extremizing $F$ at $b_0$ and $b_1$. The proof then follows in two main steps:

(1) Find an optimal solution (that is, propose feasible solutions whose objective values coincide for the primal and dual problems) for the above primal and dual problems, and consequently the values of the extremizing $F$ between any two consecutive local minima $[b_{i-1}, b_i]$;

(2) Show that the combination of these piece-wise extremizing $F$ corresponds to a cumulative distribution.

Figure 5. Four cases of piece-wise extremizing $F$ (Case 1 and Subcases 2.1, 2.2, 2.3)

Step (1) of the proof. To find the optimal solution for $x \in [b_0, b_1]$, we will consider every possible case. First, we can differentiate between two main cases, depending on the inequality relation between $\overline{F}(b_0)$ and $\underline{F}(b_1)$.

Case 1. $\overline{F}(b_0) \ge \underline{F}(b_1)$. The optimal solution in this case corresponds to the solution $f(x) = 0$, $F(x) = F_0 = F_1 = \alpha$, where $\alpha$ is an arbitrary number satisfying the condition $\underline{F}(b_1) \le \alpha \le \overline{F}(b_0)$, for the primal problem, and to the solution $c(x) = d(x) = 0$, $c_0 = d_0 = c_1 = d_1 = e = 0$ for the dual problem. See Fig. 5 for an illustration.

Case 2. $\overline{F}(b_0) < \underline{F}(b_1)$. This case is similar to the one considered in Section 4, since on $[b_0, b_1)$, $h$ has a maximum at $x = a_1$ and is increasing (resp. decreasing) on $[b_0, a_1]$ (resp. $[a_1, b_1)$). We will therefore proceed in the same way as in the proof of Proposition 3 to find the optimal solution. First recall (Lemma 1) that there is a value $\alpha$ which is a root of the function

$$\psi(\alpha) = h\left(\max\left(\overline{F}^{-1}(\alpha), b_0\right)\right) - h\left(\min\left(\underline{F}^{-1}(\alpha), b_1\right)\right)$$

with $\alpha \in [\underline{F}(a_1), \overline{F}(a_1)]$. Three subcases can now occur, depending on whether $\alpha$ is inside $[\overline{F}(b_0), \underline{F}(b_1)]$ or is higher/lower than any value in this interval. We now give details about each of these subcases, the reasoning being similar to the one in the proof of Proposition 3. All subcases and the associated extremizing distributions are illustrated in Fig. 5.

Subcase 2.1. $\overline{F}(b_0) \le \alpha \le \underline{F}(b_1)$ ($\alpha \in [\overline{F}(b_0), \underline{F}(b_1)]$). Let us denote $a' = \overline{F}^{-1}(\alpha)$, $a'' = \underline{F}^{-1}(\alpha)$. Then the optimal solution is of the form:

$$f(x) = \begin{cases} d\overline{F}(x)/dx, & b_0 < x < a' \\ 0, & a' \le x \le a'' \\ d\underline{F}(x)/dx, & a'' < x < b_1 \end{cases}, \qquad F_0 = \overline{F}(b_0), \quad F_1 = \underline{F}(b_1).$$

This implies that

$$F(x) = \int_{b_0}^{x} f(t)\,dt + F_0 = \begin{cases} \overline{F}(x), & b_0 < x < a' \\ \alpha, & a' \le x \le a'' \\ \underline{F}(x), & a'' < x < b_1 \end{cases}.$$

Let us now give the corresponding solution to the dual problem, and show that the two objective values are equal. According to the relations between the primal and dual problems, we have that if $a' < x < b_1$, then $c(x) = 0$, and if $b_0 < x < a''$, then $d(x) = 0$. It is obvious that $d_0 = c_1 = 0$. Consider the constraint

$$e + \int_{x}^{b_1} \left(-c(t) + d(t)\right) dt \le h(x)$$

for different intervals of $x$.

Let $a'' < x < b_1$. Then there holds

$$e + \int_{x}^{b_1} d(t)\,dt = h(x).$$

Hence $d(x) = -h'(x)$ and $e = h(b_1)$.

Let $a' < x < a''$. Then the inequality

$$e + \int_{a''}^{b_1} d(t)\,dt \le h(x),$$

or $h(a'') \le h(x)$, has to be valid. Indeed, the inequality is valid due to the condition $h(a') = h(a'')$.

Let $b_0 < x < a'$. Then

$$e - \int_{x}^{a'} c(t)\,dt + \int_{a''}^{b_1} d(t)\,dt = h(x)$$

or

$$-\int_{x}^{a'} c(t)\,dt + h(a'') = h(x).$$

Hence $c(x) = h'(x)$. The equality

$$e - c_0 + d_0 + \int_{b_0}^{b_1} \left(-c(t) + d(t)\right) dt = 0$$

shows that

$$h(b_1) - c_0 - h(a') + h(b_0) - h(b_1) + h(a'') = 0$$

and $c_0 = h(b_0)$. It follows from the equality $-e - c_1 + d_1 = 0$ that there holds $d_1 = e = h(b_1)$. In sum, we have

$$c(x) = \begin{cases} h'(x), & b_0 < x < a' \\ 0, & a' \le x \le b_1 \end{cases}, \qquad d(x) = \begin{cases} 0, & b_0 < x < a'' \\ -h'(x), & a'' \le x < b_1 \end{cases},$$

$$c_0 = h(b_0), \quad d_0 = c_1 = 0, \quad d_1 = e = h(b_1).$$

Let us now show that the two obtained solutions coincide:

$$v_{\min} = \int_{b_0}^{a'} h(x)\,d\overline{F}(x) + \int_{a''}^{b_1} h(x)\,d\underline{F}(x),$$

$$w_{\max} = -\overline{F}(b_0)\,h(b_0) + \underline{F}(b_1)\,h(b_1) - \int_{b_0}^{a'} \overline{F}(x)\,h'(x)\,dx - \int_{a''}^{b_1} \underline{F}(x)\,h'(x)\,dx$$

or

$$w_{\max} = -\overline{F}(b_0)\,h(b_0) + \underline{F}(b_1)\,h(b_1) + \int_{b_0}^{a'} h(x)\,d\overline{F}(x) - \overline{F}(a')\,h(a') + \overline{F}(b_0)\,h(b_0) + \int_{a''}^{b_1} h(x)\,d\underline{F}(x) - \underline{F}(b_1)\,h(b_1) + \underline{F}(a'')\,h(a'') = v_{\min}.$$

Hence the proposed solution is the optimal one.

Subcase 2.2. $\alpha > \underline{F}(b_1)$ ($[\overline{F}(b_0), \underline{F}(b_1)] < \alpha$). Denote $a' = \overline{F}^{-1}(\alpha)$. Then the optimal solution to the primal problem is:

$$f(x) = \begin{cases} d\overline{F}(x)/dx, & b_0 < x < a' \\ 0, & a' \le x \le b_1 \end{cases}, \qquad F(x) = \begin{cases} \overline{F}(x), & b_0 < x < a' \\ \overline{F}(a'), & a' \le x \le b_1 \end{cases}.$$

The corresponding solution for the dual problem is such that if $a' < x < b_1$, then $c(x) = 0$, and if $b_0 < x < b_1$, then $d(x) = 0$, hence we have $d_0 = c_1 = 0$. Again, consider the constraint

$$e + \int_{x}^{b_1} \left(-c(t) + d(t)\right) dt \le h(x)$$

for different intervals. Let $a' < x < b_1$. Then the condition $e \le h(x)$ must be valid. Let $b_0 < x < a'$. Then there holds

$$e - \int_{x}^{a'} c(t)\,dt = h(x).$$

Consequently, there hold the equalities $c(x) = h'(x)$ and $e = h(a')$. Hence the inequality $e = h(a') \le h(x)$ is valid for the interval $a' < x < b_1$. The equality

$$e - c_0 + d_0 + \int_{b_0}^{b_1} \left(-c(t) + d(t)\right) dt = 0$$

shows that $h(a') - c_0 - h(a') + h(b_0) = 0$, and, therefore, $c_0 = h(b_0)$. It follows from the equality $-e - c_1 + d_1 = 0$ that there holds $d_1 = e = h(a')$. In sum, we get

$$c(x) = \begin{cases} h'(x), & b_0 < x < a' \\ 0, & a' \le x \le b_1 \end{cases},$$

$$d(x) = 0, \quad c_0 = h(b_0), \quad d_0 = c_1 = 0, \quad d_1 = e = h(a').$$

The obtained solutions for the primal and dual problems are such that:

$$v_{\min} = \int_{b_0}^{a'} h(x)\,d\overline{F}(x),$$

$$w_{\max} = -\overline{F}(b_0)\,h(b_0) + \overline{F}(a')\,h(a') - \int_{b_0}^{a'} \overline{F}(x)\,h'(x)\,dx$$

or

$$w_{\max} = -\overline{F}(b_0)\,h(b_0) + \overline{F}(a')\,h(a') + \int_{b_0}^{a'} h(x)\,d\overline{F}(x) - \overline{F}(a')\,h(a') + \overline{F}(b_0)\,h(b_0) = v_{\min}.$$

Consequently, this is the optimal solution.

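Subcase 2.3. $\alpha < \overline{F}(b_0)$ ($\alpha < [\overline{F}(b_0), \underline{F}(b_1)]$). Denote $a'' = \underline{F}^{-1}(\overline{F}(b_0))$. Then the optimal solution to the primal problem is

$$f(x) = \begin{cases} 0, & b_0 \le x \le a'' \\ d\underline{F}(x)/dx, & a'' < x < b_1 \end{cases}, \qquad F_0 = \alpha,$$

$$F(x) = \begin{cases} \alpha, & b_0 \le x \le a'' \\ \underline{F}(x), & a'' < x < b_1 \end{cases}$$

and the proof is similar to that of the above cases. The optimal shape of $F$ for any interval $[b_i, b_{i+1}]$ can be obtained by replacing $b_0$ and $b_1$ by $b_i$ and $b_{i+1}$, respectively, in the above proofs, as they are general (as pictured in Fig. 5). All that is left to prove is that the concatenated $F$ obtained from the piece-wise extremizing solutions is increasing (i.e., that the value $F_1$ obtained for $[b_{i-1}, b_i]$ is lower than or equal to the value $F_0$ obtained for $[b_i, b_{i+1}]$).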

Step (2) of the proof. Now we show that the joint extremizing distribution function is increasing. Without loss of generality we consider only two intervals $[b_0, b_1]$ and $[b_1, b_2]$. The maximal value of the function $F(x)$ in the interval $[b_0, b_1]$ is $\max(\overline{F}(b_0), \underline{F}(b_1))$ for all the cases. The minimal value of the function $F(x)$ in the interval $[b_1, b_2]$ is $\min(\overline{F}(b_1), \underline{F}(b_2))$ for all the cases.

If $\underline{F}(b_2) \ge \overline{F}(b_0)$, then

$$\min\left(\overline{F}(b_1), \underline{F}(b_2)\right) \ge \max\left(\overline{F}(b_0), \underline{F}(b_1)\right).$$

This means that the function is increasing.

If $\underline{F}(b_2) < \overline{F}(b_0)$, then $\underline{F}(b_1) < \overline{F}(b_0)$ and we can take $F(x) = \underline{F}(b_1)$ for the left interval. On the other hand, $\underline{F}(b_2) < \overline{F}(b_1)$ and we can take $F(x) = \overline{F}(b_1)$ for the right interval. It follows from the condition $\underline{F}(b_1) \le \overline{F}(b_1)$ that the function $F(x)$ is increasing over the two neighbouring intervals.

Figure 6 gives an example of a general extremizing distribution. □

Proof using random sets. For convenience, we will consider that $h$ begins with a local minimum and ends with a local maximum $a_n$. Formulas when $h$ begins (resp. ends) with a local maximum (resp. minimum) are similar. Lower/upper expectations can be computed as follows:

$$\underline{E}(h) = \int_{0}^{\overline{F}(b_n)} \min_{b_i \in A_\gamma}\left(h(a_{*\gamma}), h(b_i), h(a^*_\gamma)\right) d\gamma + \int_{\overline{F}(b_n)}^{1} h(a_{*\gamma})\,d\gamma,$$

$$\overline{E}(h) = \int_{0}^{\underline{F}(a_1)} h(a^*_\gamma)\,d\gamma + \int_{\underline{F}(a_1)}^{\overline{F}(a_n)} \max_{a_i \in A_\gamma}\left(h(a_{*\gamma}), h(a_i), h(a^*_\gamma)\right) d\gamma.$$

We concentrate on the formula giving the lower expectation (details for the upper one are similar). The most interesting part is the first integral. We consider a particular level $\gamma$. Let $B = \{b_i, \ldots, b_j\}$ ($i \le j$) be the set of local minima included in the set $A_\gamma$ ($B$ can be empty). $b_{i-1}$ and $b_{j+1}$ are the closest local minima outside $A_\gamma$. We then consider the minimal level $\Delta\gamma > \gamma$ such that $\min_{b_i \in A_\gamma}\left(h(a_{*\gamma}), h(b_i), h(a^*_\gamma)\right) \ne \min_{b_i \in A_{\Delta\gamma}}\left(h(a_{*\Delta\gamma}), h(b_i), h(a^*_{\Delta\gamma})\right)$, with $\min_{x \in A_{\Delta\gamma}} h(x) \ne h(a_{*\Delta\gamma})$ if $\min_{x \in A_\gamma} h(x) = h(a_{*\gamma})$ and $\min_{x \in A_{\Delta\gamma}} h(x) \ne h(a^*_{\Delta\gamma})$ if $\min_{x \in A_\gamma} h(x) = h(a^*_\gamma)$. As in the LP proof, four different cases can occur:
Case A: we have

$$\min_{b_i \in A_\gamma}\left(h(a_{*\gamma}), h(b_i), h(a^*_\gamma)\right) = h(b_k)$$

and

$$\min_{b_i \in A_{\Delta\gamma}}\left(h(a_{*\Delta\gamma}), h(b_i), h(a^*_{\Delta\gamma})\right) = h(b_{k'})$$

with $k \ne k'$ and where $h(b_k)$ and $h(b_{k'})$ are respectively the lowest local minima of $h(x)$ for $x \in A_\gamma$ and $x \in A_{\Delta\gamma}$. That is, probability mass is concentrated on $b_k$ from $\gamma$ to $\Delta\gamma$, and concentrates on $b_{k'}$ for values $\gamma' \ge \Delta\gamma$. This corresponds to Case 1 of Fig. 5 and of the previous proof. In Fig. 6, it corresponds to the extremizing distribution between $b_2$ and $b_3$.

Case B: we have

$$\min_{b_i \in A_\gamma}\left(h(a_{*\gamma}), h(b_i), h(a^*_\gamma)\right) = h(a_{*\gamma})$$

and

$$\min_{b_i \in A_{\Delta\gamma}}\left(h(a_{*\Delta\gamma}), h(b_i), h(a^*_{\Delta\gamma})\right) = h(a^*_{\Delta\gamma}).$$

This can happen when any local minimum inside $A_\gamma$, $A_{\Delta\gamma}$ is higher than the local minima just outside it. In this case, it can happen that the minimal values stand at the bounds of the intervals $A_{\gamma'}$ for any $\gamma \le \gamma' \le \Delta\gamma$. This corresponds to Case 2.1 of Fig. 5 and of the previous proof. In Fig. 6, it corresponds to the extremizing distribution between $b_4$ and $b_5$.

Case C: we have

$$\min_{b_i \in A_\gamma}\left(h(a_{*\gamma}), h(b_i), h(a^*_\gamma)\right) = h(b_k)$$

and

$$\min_{b_i \in A_{\Delta\gamma}}\left(h(a_{*\Delta\gamma}), h(b_i), h(a^*_{\Delta\gamma})\right) = h(a^*_{\Delta\gamma}),$$

with $h(b_k)$ the lowest local minimum for $b_k \in A_\gamma$. The minimum shifts from the left bound of $A_\gamma$ (coinciding with $\overline{F}$) to $b_k$. This corresponds to Case 2.2 of Fig. 5 and of the previous proof. In Fig. 6, it corresponds to the extremizing distribution between $b_1$ and $b_2$.

Case D: we have

$$\min_{b_i \in A_\gamma}\left(h(a_{*\gamma}), h(b_i), h(a^*_\gamma)\right) = h(a_{*\gamma})$$

and

$$\min_{b_i \in A_{\Delta\gamma}}\left(h(a_{*\Delta\gamma}), h(b_i), h(a^*_{\Delta\gamma})\right) = h(b_{k'}),$$

with $h(b_{k'})$ the lowest local minimum for $b_{k'} \in A_{\Delta\gamma}$. The situation is similar to the previous case, and corresponds to Case 2.3 of Fig. 5 and of the previous proof. In Fig. 6, it corresponds to the extremizing distribution between $b_3$ and $b_4$.

When $\min_{b_i \in A_\gamma}\left(h(a_{*\gamma}), h(b_i), h(a^*_\gamma)\right) = \min_{b_i \in A_{\Delta\gamma}}\left(h(a_{*\Delta\gamma}), h(b_i), h(a^*_{\Delta\gamma})\right) = h(b_k)$ with $b_k \in A_\gamma \cap A_{\Delta\gamma}$, the probability mass stays concentrated on $b_k$, and this corresponds to a discontinuity mentioned in Proposition 5. By letting $\gamma$ evolve from 0 to 1, we get the extremizing cumulative distribution of Proposition 5. □

Figure 6. Example of optimal $F$ with general $h$

Looking at the extremizing distribution $F$ pictured in Figure 6, we can see that computing the lower expectation consists in concentrating probability masses over local minima, while giving the least possible amount of probability mass to higher values of $h(x)$, as in the case of a function having one maximum. Thus, our results confirm what could intuitively have been guessed at first sight. They also give analytical and computational tools to compute lower and upper expectations. They are illustrated in the next example.

Example 5. We consider the same p-box $[\underline{F}, \overline{F}]$ as in the previous examples. However, we assume that the loss function is of the type $h(x) = (0.6x)\cos(x)$. It could, for instance, model the return of a game based on the movement of a pendulum. It could also model the loss incurred by a unit failure whose functioning alternates between low and full capacity (failure during low capacity periods costing less). As a loss after failure has to be positive, one can consider $h(x) + \mu$, with $\mu$ a positive constant. $h(x)$ oscillates between local maxima and minima.

(Footnote: This does not change further calculations, as $\underline{E}(h + \mu) = \underline{E}(h) + \mu$.)

These extrema are solutions of $\cos(x) = x\sin(x)$:

$a_1 = 0.860$, $b_1 = 3.426$, $a_2 = 6.437$, $b_2 = 9.529$, $a_3 = 12.645$,
$b_3 = 15.771$, $a_4 = 18.902$, $b_4 = 22.036$, $a_5 = 25.172$, $b_5 = 28.31$.

We will compute the extremizing distribution for each interval $[b_{i-1}, b_i)$ for $i = 1, \ldots, 5$, with $b_0 = 0$. Let us analyze the first interval $[0, b_1)$. The value $\alpha \in (0, 1)$ in this interval can be found as a root of the equation

$$\max(-2\ln(1-\alpha), 0) \cdot \cos\left(\max(-2\ln(1-\alpha), 0)\right) = \min(-5\ln(1-\alpha), 3.426) \cdot \cos\left(\min(-5\ln(1-\alpha), 3.426)\right).$$

However, many different values of $\alpha \in (0, 1)$ are solutions to the above equation. Relying on the proof of Proposition 5 and on the various subcases exposed therein (see Fig. 5), we should, for a given interval $[b_{i-1}, b_i)$, take only the root(s) providing an interval $[a', a'']$ such that $a_i \in [a', a'']$. For $[0, b_1)$, this corresponds to $\alpha = 0.215$, for which the values $a'$, $a''$ are

$$a' = \max(-2\ln(1-\alpha), b_{i-1}) = \max(-2\ln(1-0.215), 0) = 0.483,$$
$$a'' = \min(-5\ln(1-\alpha), b_i) = \min(-5\ln(1-0.215), 3.426) = 1.209.$$

It can be seen from the above that $a_1 = 0.860 \in [0.483, 1.209]$. We can now determine the extremizing distribution function in $[0, b_1)$, which is as follows:

$$F(x) = \begin{cases} 1 - \exp(-0.5 \cdot x), & x < 0.483 \\ 0.215, & 0.483 \le x < 1.209 \\ 1 - \exp(-0.2 \cdot x), & 1.209 \le x < 3.426 \end{cases}$$

This corresponds to case 2.1 of Figure 5. The "jump" (i.e., probability mass) at point $b_1$ is of the size

$$\min(1 - \exp(-0.5 \cdot 3.426), 0.808) - \max(1 - \exp(-0.2 \cdot 3.426), 0.215) = 0.312.$$

Since $\overline{F}(3.426) - \underline{F}(3.426) = 0.33 > 0.312$, this means that the extremizing distribution in $[b_1, b_2)$ starts with a constant value $F(b_1) = \underline{F}(3.426) + 0.312 = 0.808$ and with a horizontal line. Moreover, we can check that 0.808 is the right starting point since it is a root of the equation

$$\max(-2\ln(1-\alpha), 3.426) \cdot \cos\left(\max(-2\ln(1-\alpha), 3.426)\right) = \min(-5\ln(1-\alpha), 9.529) \cdot \cos\left(\min(-5\ln(1-\alpha), 9.529)\right).$$

And we have $a' = 3.426$ and $a'' = 8.263$ for $\alpha = 0.808$. By taking into account the analysis of the first interval, we can write

$$F(x) = \begin{cases} 0.808, & 3.426 \le x < 8.263 \\ 1 - \exp(-0.2 \cdot x), & 8.263 \le x < 9.529 \end{cases}$$

This corresponds to case 2.3 of Figure 5. The jump at $b_2$ has value $9.77 \times 10^{-2}$, and we have again $\overline{F}(9.529) - \underline{F}(9.529) = 0.14 > 9.77 \times 10^{-2}$. The analyses for the other intervals are similar (they all belong to case 2.3). For the third interval $[b_2, b_3)$, $\alpha = 0.949$, $a' = 9.529$, $a'' = 14.831$ and we have

$$F(x) = \begin{cases} 0.949, & 9.529 \le x < 14.831 \\ 1 - \exp(-0.2 \cdot x), & 14.831 \le x < 15.771 \end{cases}$$

The jump at $b_3$ is of value $2.867 \times 10^{-2}$, and for $[b_3, b_4)$, we have $\alpha = 0.986$, $a' = 15.771$, $a'' = 21.255$ and

$$F(x) = \begin{cases} 0.986, & 15.771 \le x < 21.255 \\ 1 - \exp(-0.2 \cdot x), & 21.255 \le x < 22.036 \end{cases}$$

The jump at $b_4$ is of value $8.189 \times 10^{-3}$, and for $[b_4, b_5)$, we have $\alpha = 0.996$, $a' = 22.036$, $a'' = 27.62$ and

$$F(x) = \begin{cases} 0.996, & 22.036 \le x < 27.62 \\ 1 - \exp(-0.2 \cdot x), & 27.62 \le x < 28.31 \end{cases}$$

The jump at point $b_5$ is of the size $3.076 \times 10^{-3}$.

Note that jump sizes decrease as the index $i$ increases. This is not true in general, and is here due to the particular shape of $h(x)$. By computing the extremizing distribution for every interval $[b_{i-1}, b_i)$, we can reach the lower expectation. That is, if we denote by $\underline{E}_i(h)$ the lower expectation of $h$ computed with the extremizing distribution obtained for the $i$ intervals $[b_{j-1}, b_j)$, $j = 1, \ldots, i$, and if $h$ has a finite number of local maxima and minima, say $r$, then $\underline{E}(h) = \underline{E}_r(h)$. However, in this example, $r = \infty$ and $\underline{E}(h) = \lim_{r \to \infty} \underline{E}_r(h)$. Therefore, only an approximate solution can be found (provided, as the footnote notes, that this expectation exists). We can therefore let $r$ increase until $|\underline{E}_r(h) - \underline{E}_{r-1}(h)| \le \varepsilon$, with $\varepsilon > 0$ a prescribed precision. For instance, we have

$$\underline{E}_1(h) = \int_{0}^{0.483} 0.6x\cos(x) \cdot 0.5e^{-0.5x}\,dx + \int_{1.209}^{3.426} 0.6x\cos(x) \cdot 0.2e^{-0.2x}\,dx + 0.6 \cdot 3.426\cos(3.426) \cdot 0.312 = -0.82.$$

Pursuing the computations, we have

$$\underline{E}_2(h) = -1.558, \quad \underline{E}_3(h) = -1.9, \quad \underline{E}_4(h) = -2.033, \quad \underline{E}_5(h) = -2.093.$$

If we take $\varepsilon = 0.1$, then $|\underline{E}_5(h) - \underline{E}_4(h)| = 0.06 \le 0.1$, and we consider $\underline{E}_5(h) = -2.093$ as a sufficient approximation of the true (but unknown) lower expectation. The upper expectation of $h$ can be obtained by considering the function $-h(x)$ and by computing $\underline{E}(-h)$. Hence $\overline{E}(h) = -\underline{E}(-h) = 1.94$ (approximation with $\varepsilon = 0.1$).
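The interval-by-interval procedure of Example 5 can be sketched in code. This is a hedged illustration rather than the authors' algorithm: it hard-codes the extrema listed in the example, assumes (as the example does) that every interval is of the Case 2 type, and uses bisection and a Simpson rule chosen for convenience. The names `level` and `partial_lower_expectation` are hypothetical. Since the jump at $b_i$ needs the level of the next interval, the listed extrema only allow $r \le 4$.

```python
import math

# Extrema reported in the example (b_0 = 0 is the left end of the support).
MINIMA = [0.0, 3.426, 9.529, 15.771, 22.036, 28.31]  # b_0 .. b_5
MAXIMA = [0.860, 6.437, 12.645, 18.902, 25.172]      # a_1 .. a_5

def h(x):
    return 0.6 * x * math.cos(x)

def cdf_low(x): return 1.0 - math.exp(-0.2 * x)
def cdf_up(x):  return 1.0 - math.exp(-0.5 * x)
def inv_low(g): return -5.0 * math.log(1.0 - g)
def inv_up(g):  return -2.0 * math.log(1.0 - g)

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule with n (even) sub-intervals."""
    step = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * step)
    return total * step / 3.0

def level(i):
    """Root alpha_i of psi on [cdf_low(a_i), cdf_up(a_i)] for the interval [b_{i-1}, b_i)."""
    b_prev, b_next, a_i = MINIMA[i - 1], MINIMA[i], MAXIMA[i - 1]
    def psi(g):
        return h(max(inv_up(g), b_prev)) - h(min(inv_low(g), b_next))
    lo, hi = cdf_low(a_i), cdf_up(a_i)  # psi is negative at lo and positive at hi
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if psi(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def partial_lower_expectation(r):
    """E_r(h): contributions of the first r intervals, jumps at b_1 .. b_r included (r <= 4)."""
    total = 0.0
    for i in range(1, r + 1):
        alpha, alpha_next = level(i), level(i + 1)
        b_prev, b_next = MINIMA[i - 1], MINIMA[i]
        a1 = max(inv_up(alpha), b_prev)   # a'
        a2 = min(inv_low(alpha), b_next)  # a''
        if a1 > b_prev:  # part where F follows the upper CDF
            total += simpson(lambda x: h(x) * 0.5 * math.exp(-0.5 * x), b_prev, a1)
        if a2 < b_next:  # part where F follows the lower CDF
            total += simpson(lambda x: h(x) * 0.2 * math.exp(-0.2 * x), a2, b_next)
        jump = min(cdf_up(b_next), alpha_next) - max(cdf_low(b_next), alpha)
        total += h(b_next) * jump         # probability mass concentrated at b_i
    return total

if __name__ == "__main__":
    for r in range(1, 5):
        print(r, partial_lower_expectation(r))
```

Small differences with the values printed in the example are to be expected, since the example rounds the levels $\alpha_i$ before using them.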

This example is useful in two respects: first, it illustrates why it is useful to have results concerning the piece-wise extremizing distribution; second, it shows that even when analytical calculations are possible, it is not always possible to compute an exact value, hence the interest of the generic methods proposed earlier in this paper. This is particularly true when $h$ has an infinity of local extrema and when $\underline{F}$, $\overline{F}$ have infinite support. It also addresses the question of the choice of levels $\alpha$ when many solutions are possible.

Coming back to numerical approximations using linear programming, our results indicate that some regions should be sampled in priority. For example, when computing lower expectations, one should primarily consider the values $b_i$ (local minima) and sample in neighbourhoods of these values, as this is where probability masses are concentrated. The converse (sampling around local maxima) holds when computing upper expectations.



(Footnote: We assume here that the expectation $\underline{E}(h)$ exists.)

If we now consider random sets, we can formulate the problem of computing lower expectations as follows: let $m$ be the number of local minima, and let $\gamma_{j*}$, $\gamma_j^*$ be the two values bounding the probability mass concentrated on the local minimum $b_j$, for $j = 1, \ldots, m$ (for example, for the local minimum $b_2$ in Figure 6, we would have $\gamma_{2*} = \alpha_1$, $\gamma_2^* = \alpha_2$), then

(28) $\underline{E}(h) = \sum_{j=1}^{m} \left( \int_{\gamma_{j-1}^*}^{\gamma_{j*}} \min\left(h(a_{*\gamma}), h(a^*_\gamma)\right) d\gamma + \left(\gamma_j^* - \gamma_{j*}\right) h(b_j) \right)$.

This comes down to summing all the probability masses concentrated on local minima, and to calculating integrals where the extremizing distribution coincides either with $\overline{F}$ or $\underline{F}$. Note that, as in Example 5, $m$ could be equal to $\infty$. This formulation clearly shows that, when using numerical methods with the random set approach, there is no need to discretize the intervals $[\gamma_{j*}, \gamma_j^*]$ into finer intervals, as this will not improve the precision of the result.

The case of conditional expectations with general functions will not be treated here, as it would require long developments that would not bring many new ideas.



6. Conclusions 

We have considered the problem of computing lower and upper expectations on 
p-boxes and particular functions under two different approaches: by using linear 
programming and by using the fact that p-boxes are special cases of random sets. 
Although the two approaches try to solve equivalent problems, their differences 
suggest different ways to approximate the solutions of those problems. As we have 
seen, knowing the behaviour of the function over which lower and upper expecta- 
tions are to be estimated can greatly increase the computational efficiency (and 
even permit analytical computation). 

However, more important than their differences is the complementarity of both approaches. Indeed, one approach can shed light on some problems obscured by the other approach (e.g., the level $\alpha$ of Proposition 3). Another advantage of combining both approaches is the ease with which some problems are solved and the elegant formulation resulting from this combination (e.g., the conditional case). Let us nevertheless note that the constraint programming approach can be applied to imprecise probabilities in general, while the random set approach is indeed limited to random sets.

In this paper, we have concentrated on the case where uncertainty bears on one 
variable. The case where multiple variables are tainted with uncertainty described 
by p-boxes will be studied in a forthcoming paper. Concerning future work related 
to this topic, three lines of research seem interesting to us: 

• Study of other simple representations: it is desirable to achieve similar studies for other simple uncertainty representations involving sets of probabilities. This includes probability intervals, possibility distributions [10], clouds [20].

• Discretization schemes: when exact solutions cannot be computed, what is the best choice of points $x_1, \ldots, x_N$ or of levels $\gamma_1, \ldots, \gamma_M$, respectively, to approximate the solution by using LP or RS (already mentioned by other authors [23])? We have mentioned how our results can possibly help in this task, but proposing generic algorithms and empirically testing them largely remains to be done.

• Convex mixture of functions: in some applications, one can choose a strategy that is a convex mixture between a finite set of options having utility $h_1, \ldots, h_N$. For such cases, one often has to find the weights $\lambda_1, \ldots, \lambda_N$ such that $\sum_{i=1}^{N} \lambda_i h_i$ has the maximal lower expectation. It would be interesting to study whether results similar to the ones exposed in this paper also exist for this problem when using simple uncertainty representations (e.g., p-boxes).

We would like to end this paper with two final remarks: 

• It is clear from our results that the extreme distributions over which the upper and lower expectations are reached will be, in general, discontinuous. Since any discontinuous function can be approximated as closely as one wants by continuous ones, we do not see it as a big flaw. However, in some cases, it could be desirable to add constraints about which cumulative distributions inside $[\underline{F}, \overline{F}]$ are admissible. This kind of question is addressed, for example, by Kozine and Krymsky [15].

• We mentioned at the beginning of the paper that our study is restricted to the case where either cumulative distributions are assumed to be $\sigma$-additive or where $h$ is continuous. Again, this is not a big limitation when dealing with practical applications, and this avoids many mathematical subtleties arising with the consideration of finitely additive probabilities [19].